[LAA] Use SCEVUse to add extra NUW flags to pointer bounds. (WIP) #91962
base: users/fhahn/scevuse
Conversation
@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

Use SCEVUse to add a NUW flag to the upper bound of an accessed pointer. We must already have proved that the pointers do not wrap, as otherwise we could not use them for runtime check computations.

By adding the use-specific NUW flag, we can detect cases where SCEV can prove that the compared pointers must overlap, so the runtime checks will always be false. In that case, there is no point in vectorizing with runtime checks.

Note that this depends on c2895cd27fbf200d1da056bc66d77eeb62690bf0, which could be submitted separately if desired; without the current change, I don't think it triggers in practice though.

Depends on #91961

Patch is 35.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91962.diff

5 Files Affected:
diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index 6ebd0fb8477a0..0663ef6a2865d 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -372,10 +372,10 @@ struct RuntimeCheckingPtrGroup {
/// The SCEV expression which represents the upper bound of all the
/// pointers in this group.
- const SCEV *High;
+ SCEVUse High;
/// The SCEV expression which represents the lower bound of all the
/// pointers in this group.
- const SCEV *Low;
+ SCEVUse Low;
/// Indices of all the pointers that constitute this grouping.
SmallVector<unsigned, 2> Members;
/// Address space of the involved pointers.
@@ -413,10 +413,10 @@ class RuntimePointerChecking {
TrackingVH<Value> PointerValue;
/// Holds the smallest byte address accessed by the pointer throughout all
/// iterations of the loop.
- const SCEV *Start;
+ SCEVUse Start;
/// Holds the largest byte address accessed by the pointer throughout all
/// iterations of the loop, plus 1.
- const SCEV *End;
+ SCEVUse End;
/// Holds the information if this pointer is used for writing to memory.
bool IsWritePtr;
/// Holds the id of the set of pointers that could be dependent because of a
@@ -429,7 +429,7 @@ class RuntimePointerChecking {
/// True if the pointer expressions needs to be frozen after expansion.
bool NeedsFreeze;
- PointerInfo(Value *PointerValue, const SCEV *Start, const SCEV *End,
+ PointerInfo(Value *PointerValue, SCEVUse Start, SCEVUse End,
bool IsWritePtr, unsigned DependencySetId, unsigned AliasSetId,
const SCEV *Expr, bool NeedsFreeze)
: PointerValue(PointerValue), Start(Start), End(End),
@@ -443,8 +443,10 @@ class RuntimePointerChecking {
/// Reset the state of the pointer runtime information.
void reset() {
Need = false;
+ AlwaysFalse = false;
Pointers.clear();
Checks.clear();
+ CheckingGroups.clear();
}
/// Insert a pointer and calculate the start and end SCEVs.
@@ -501,6 +503,8 @@ class RuntimePointerChecking {
/// This flag indicates if we need to add the runtime check.
bool Need = false;
+ bool AlwaysFalse = false;
+
/// Information about the pointers that may require checking.
SmallVector<PointerInfo, 2> Pointers;
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index e1d0f900c9050..208c034763f7c 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -210,8 +210,8 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
bool NeedsFreeze) {
ScalarEvolution *SE = PSE.getSE();
- const SCEV *ScStart;
- const SCEV *ScEnd;
+ SCEVUse ScStart;
+ SCEVUse ScEnd;
if (SE->isLoopInvariant(PtrExpr, Lp)) {
ScStart = ScEnd = PtrExpr;
@@ -223,6 +223,8 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
ScStart = AR->getStart();
ScEnd = AR->evaluateAtIteration(Ex, *SE);
const SCEV *Step = AR->getStepRecurrence(*SE);
+ if (auto *Comm = dyn_cast<SCEVCommutativeExpr>(ScEnd))
+ ScEnd = SCEVUse(ScEnd, 2);
// For expressions with negative step, the upper bound is ScStart and the
// lower bound is ScEnd.
@@ -244,7 +246,10 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
auto &DL = Lp->getHeader()->getModule()->getDataLayout();
Type *IdxTy = DL.getIndexType(Ptr->getType());
const SCEV *EltSizeSCEV = SE->getStoreSizeOfExpr(IdxTy, AccessTy);
- ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV);
+ // TODO: this computes one-past-the-end. ScEnd + EltSizeSCEV - 1 is the last
+ // accessed byte. Not entirely sure if one-past-the-end must also not wrap? If
+ // it does, could compute and use last accessed byte instead.
+ ScEnd = SCEVUse(SE->getAddExpr(ScEnd, EltSizeSCEV), 2);
Pointers.emplace_back(Ptr, ScStart, ScEnd, WritePtr, DepSetId, ASId, PtrExpr,
NeedsFreeze);
@@ -379,6 +384,11 @@ SmallVector<RuntimePointerCheck, 4> RuntimePointerChecking::generateChecks() {
if (needsChecking(CGI, CGJ)) {
tryToCreateDiffCheck(CGI, CGJ);
Checks.push_back(std::make_pair(&CGI, &CGJ));
+ if (SE->isKnownPredicate(CmpInst::ICMP_UGT, CGI.High, CGJ.Low) &&
+ SE->isKnownPredicate(CmpInst::ICMP_ULE, CGI.Low, CGJ.High)) {
+ AlwaysFalse = true;
+ return {};
+ }
}
}
}
@@ -635,8 +645,7 @@ void RuntimePointerChecking::print(raw_ostream &OS, unsigned Depth) const {
const auto &CG = CheckingGroups[I];
OS.indent(Depth + 2) << "Group " << &CG << ":\n";
- OS.indent(Depth + 4) << "(Low: " << *CG.Low << " High: " << *CG.High
- << ")\n";
+ OS.indent(Depth + 4) << "(Low: " << CG.Low << " High: " << CG.High << ")\n";
for (unsigned J = 0; J < CG.Members.size(); ++J) {
OS.indent(Depth + 6) << "Member: " << *Pointers[CG.Members[J]].Expr
<< "\n";
@@ -1274,6 +1283,7 @@ bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,
// If we can do run-time checks, but there are no checks, no runtime checks
// are needed. This can happen when all pointers point to the same underlying
// object for example.
+ CanDoRT &= !RtCheck.AlwaysFalse;
RtCheck.Need = CanDoRT ? RtCheck.getNumberOfChecks() != 0 : MayNeedRTCheck;
bool CanDoRTIfNeeded = !RtCheck.Need || CanDoRT;
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 320be6f26fc0a..04d7450e6cc2e 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -11228,8 +11228,7 @@ bool ScalarEvolution::isKnownPredicateViaNoOverflow(ICmpInst::Predicate Pred,
XNonConstOp = X;
XFlagsPresent = ExpectedFlags;
}
- if (!isa<SCEVConstant>(XConstOp) ||
- (XFlagsPresent & ExpectedFlags) != ExpectedFlags)
+ if (!isa<SCEVConstant>(XConstOp))
return false;
if (!splitBinaryAdd(Y, YConstOp, YNonConstOp, YFlagsPresent)) {
@@ -11238,13 +11237,22 @@ bool ScalarEvolution::isKnownPredicateViaNoOverflow(ICmpInst::Predicate Pred,
YFlagsPresent = ExpectedFlags;
}
- if (!isa<SCEVConstant>(YConstOp) ||
- (YFlagsPresent & ExpectedFlags) != ExpectedFlags)
+ if (YNonConstOp != XNonConstOp)
return false;
- if (YNonConstOp != XNonConstOp)
+ if (!isa<SCEVConstant>(YConstOp))
return false;
+ if (YNonConstOp != Y && ExpectedFlags == SCEV::FlagNUW) {
+ if ((YFlagsPresent & ExpectedFlags) != ExpectedFlags)
+ return false;
+ } else {
+ if ((XFlagsPresent & ExpectedFlags) != ExpectedFlags)
+ return false;
+ if ((YFlagsPresent & ExpectedFlags) != ExpectedFlags)
+ return false;
+ }
+
OutC1 = cast<SCEVConstant>(XConstOp)->getAPInt();
OutC2 = cast<SCEVConstant>(YConstOp)->getAPInt();
@@ -11878,6 +11886,8 @@ bool ScalarEvolution::splitBinaryAdd(SCEVUse Expr, SCEVUse &L, SCEVUse &R,
L = AE->getOperand(0);
R = AE->getOperand(1);
Flags = AE->getNoWrapFlags();
+ Flags = setFlags(AE->getNoWrapFlags(),
+ static_cast<SCEV::NoWrapFlags>(Expr.getInt()));
return true;
}
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
index cd388b4ee87f2..2f9f6dc39b19d 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
@@ -24,14 +24,14 @@ define void @forked_ptrs_simple(ptr nocapture readonly %Base1, ptr nocapture rea
; CHECK-NEXT: %select = select i1 %cmp, ptr %gep.1, ptr %gep.2
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP1]]:
-; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (400 + %Dest)(u nuw))
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%loop>
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%loop>
; CHECK-NEXT: Group [[GRP2]]:
-; CHECK-NEXT: (Low: %Base1 High: (400 + %Base1))
+; CHECK-NEXT: (Low: %Base1 High: (400 + %Base1)(u nuw))
; CHECK-NEXT: Member: {%Base1,+,4}<nw><%loop>
; CHECK-NEXT: Group [[GRP3]]:
-; CHECK-NEXT: (Low: %Base2 High: (400 + %Base2))
+; CHECK-NEXT: (Low: %Base2 High: (400 + %Base2)(u nuw))
; CHECK-NEXT: Member: {%Base2,+,4}<nw><%loop>
; CHECK-EMPTY:
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -58,14 +58,14 @@ define void @forked_ptrs_simple(ptr nocapture readonly %Base1, ptr nocapture rea
; RECURSE-NEXT: %select = select i1 %cmp, ptr %gep.1, ptr %gep.2
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP4]]:
-; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest)(u nuw))
; RECURSE-NEXT: Member: {%Dest,+,4}<nuw><%loop>
; RECURSE-NEXT: Member: {%Dest,+,4}<nuw><%loop>
; RECURSE-NEXT: Group [[GRP5]]:
-; RECURSE-NEXT: (Low: %Base1 High: (400 + %Base1))
+; RECURSE-NEXT: (Low: %Base1 High: (400 + %Base1)(u nuw))
; RECURSE-NEXT: Member: {%Base1,+,4}<nw><%loop>
; RECURSE-NEXT: Group [[GRP6]]:
-; RECURSE-NEXT: (Low: %Base2 High: (400 + %Base2))
+; RECURSE-NEXT: (Low: %Base2 High: (400 + %Base2)(u nuw))
; RECURSE-NEXT: Member: {%Base2,+,4}<nw><%loop>
; RECURSE-EMPTY:
; RECURSE-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -132,16 +132,16 @@ define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture read
; CHECK-NEXT: %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP7]]:
-; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (400 + %Dest)(u nuw))
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP8]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)(u nuw))
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP9]]:
-; CHECK-NEXT: (Low: %Base2 High: (400 + %Base2))
+; CHECK-NEXT: (Low: %Base2 High: (400 + %Base2)(u nuw))
; CHECK-NEXT: Member: {%Base2,+,4}<nw><%for.body>
; CHECK-NEXT: Group [[GRP10]]:
-; CHECK-NEXT: (Low: %Base1 High: (400 + %Base1))
+; CHECK-NEXT: (Low: %Base1 High: (400 + %Base1)(u nuw))
; CHECK-NEXT: Member: {%Base1,+,4}<nw><%for.body>
; CHECK-EMPTY:
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -171,16 +171,16 @@ define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture read
; RECURSE-NEXT: %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP11]]:
-; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest)(u nuw))
; RECURSE-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP12]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)(u nuw))
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP13]]:
-; RECURSE-NEXT: (Low: %Base2 High: (400 + %Base2))
+; RECURSE-NEXT: (Low: %Base2 High: (400 + %Base2)(u nuw))
; RECURSE-NEXT: Member: {%Base2,+,4}<nw><%for.body>
; RECURSE-NEXT: Group [[GRP14]]:
-; RECURSE-NEXT: (Low: %Base1 High: (400 + %Base1))
+; RECURSE-NEXT: (Low: %Base1 High: (400 + %Base1)(u nuw))
; RECURSE-NEXT: Member: {%Base1,+,4}<nw><%for.body>
; RECURSE-EMPTY:
; RECURSE-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -232,16 +232,16 @@ define dso_local void @forked_ptrs_different_base_same_offset_64b(ptr nocapture
; CHECK-NEXT: %.sink.in = getelementptr inbounds double, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP15]]:
-; CHECK-NEXT: (Low: %Dest High: (800 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (800 + %Dest)(u nuw))
; CHECK-NEXT: Member: {%Dest,+,8}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP16]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)(u nuw))
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP17]]:
-; CHECK-NEXT: (Low: %Base2 High: (800 + %Base2))
+; CHECK-NEXT: (Low: %Base2 High: (800 + %Base2)(u nuw))
; CHECK-NEXT: Member: {%Base2,+,8}<nw><%for.body>
; CHECK-NEXT: Group [[GRP18]]:
-; CHECK-NEXT: (Low: %Base1 High: (800 + %Base1))
+; CHECK-NEXT: (Low: %Base1 High: (800 + %Base1)(u nuw))
; CHECK-NEXT: Member: {%Base1,+,8}<nw><%for.body>
; CHECK-EMPTY:
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -271,16 +271,16 @@ define dso_local void @forked_ptrs_different_base_same_offset_64b(ptr nocapture
; RECURSE-NEXT: %.sink.in = getelementptr inbounds double, ptr %spec.select, i64 %indvars.iv
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP19]]:
-; RECURSE-NEXT: (Low: %Dest High: (800 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (800 + %Dest)(u nuw))
; RECURSE-NEXT: Member: {%Dest,+,8}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP20]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)(u nuw))
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP21]]:
-; RECURSE-NEXT: (Low: %Base2 High: (800 + %Base2))
+; RECURSE-NEXT: (Low: %Base2 High: (800 + %Base2)(u nuw))
; RECURSE-NEXT: Member: {%Base2,+,8}<nw><%for.body>
; RECURSE-NEXT: Group [[GRP22]]:
-; RECURSE-NEXT: (Low: %Base1 High: (800 + %Base1))
+; RECURSE-NEXT: (Low: %Base1 High: (800 + %Base1)(u nuw))
; RECURSE-NEXT: Member: {%Base1,+,8}<nw><%for.body>
; RECURSE-EMPTY:
; RECURSE-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -332,16 +332,16 @@ define dso_local void @forked_ptrs_different_base_same_offset_23b(ptr nocapture
; CHECK-NEXT: %.sink.in = getelementptr inbounds i23, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP23]]:
-; CHECK-NEXT: (Low: %Dest High: (399 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (399 + %Dest)(u nuw))
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP24]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)(u nuw))
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP25]]:
-; CHECK-NEXT: (Low: %Base2 High: (399 + %Base2))
+; CHECK-NEXT: (Low: %Base2 High: (399 + %Base2)(u nuw))
; CHECK-NEXT: Member: {%Base2,+,4}<nw><%for.body>
; CHECK-NEXT: Group [[GRP26]]:
-; CHECK-NEXT: (Low: %Base1 High: (399 + %Base1))
+; CHECK-NEXT: (Low: %Base1 High: (399 + %Base1)(u nuw))
; CHECK-NEXT: Member: {%Base1,+,4}<nw><%for.body>
; CHECK-EMPTY:
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -371,16 +371,16 @@ define dso_local void @forked_ptrs_different_base_same_offset_23b(ptr nocapture
; RECURSE-NEXT: %.sink.in = getelementptr inbounds i23, ptr %spec.select, i64 %indvars.iv
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP27]]:
-; RECURSE-NEXT: (Low: %Dest High: (399 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (399 + %Dest)(u nuw))
; RECURSE-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP28]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)(u nuw))
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP29]]:
-; RECURSE-NEXT: (Low: %Base2 High: (399 + %Base2))
+; RECURSE-NEXT: (Low: %Base2 High: (399 + %Base2)(u nuw))
; RECURSE-NEXT: Member: {%Base2,+,4}<nw><%for.body>
; RECURSE-NEXT: Group [[GRP30]]:
-; RECURSE-NEXT: (Low: %Base1 High: (399 + %Base1))
+; RECURSE-NEXT: (Low: %Base1 High: (399 + %Base1)(u nuw))
; RECURSE-NEXT: Member: {%Base1,+,4}<nw><%for.body>
; RECURSE-EMPTY:
; RECURSE-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -432,16 +432,16 @@ define dso_local void @forked_ptrs_different_base_same_offset_6b(ptr nocapture r
; CHECK-NEXT: %.sink.in = getelementptr inbounds i6, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP31]]:
-; CHECK-NEXT: (Low: %Dest High: (100 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (100 + %Dest)(u nuw))
; CHECK-NEXT: Member: {%Dest,+,1}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP32]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)(u nuw))
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP33]]:
-; CHECK-NEXT: (Low: %Base2 High: (100 + %Base2))
+; CHECK-NEXT: (Low: %Base2 High: (100 + %Base2)(u nuw))
; CHECK-NEXT: Member: {%Base2,+,1}<nw><%for.body>
; CHECK-NEXT: Group [[GRP34]]:
-; CHECK-NEXT: (Low: %Base1 High: (100 + %Base1))
+; CHECK-NEXT: (Low: %Base1 High: (100 + %Base1)(u nuw))
; CHECK-NEXT: Member: {%Base1,+,1}<nw><%for.body>
; CHECK-EMPTY:
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -471,16 +471,16 @@ define dso_local void @forked_ptrs_different_base_same_offset_6b(ptr nocapture r
; RECURSE-NEXT: %.sink.in = getelementptr inbounds i6, ptr %spec.select, i64 %indvars.iv
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP35]]:
-; RECURSE-NEXT: (Low: %Dest High: (100 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (100 + %Dest)(u nuw))
; RECURSE-NEXT: Member: {%Dest,+,1}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP36]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)(u nuw))
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP37]]:
-; RECURSE-NEXT: (Low: %Base2 High: (100 + %Base2))
+; RECURSE-NEXT: (Low: %Base2 High: (100 + %Base2)(u nuw))
; RECURSE-NEXT: Member: {%Base2,+,1}<nw><%for.body>
; RECURSE-NEXT: Group [[GRP38]]:
-; RECURSE-NEXT: (Low: %Base1 High: (100 + %Base1))
+; RECURSE-NEXT: (Low: %Base1 High: (100 + %Base1)(u nuw))
; RECURSE-NEXT: Member: {%Base1,+,1}<nw><%for.body>
; RECURSE-EMPTY:
; RECURSE-NEXT: Non vectorizable stores to invariant address were not found in loop.
@@ -532,16 +532,16 @@ define dso_local void @forked_ptrs_different_base_same_offset_possible_poison(pt
; CHECK-NEXT: %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP39]]:
-; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (400 + %Dest)(u nuw))
; CHECK-NEXT: Member: {%Dest,+,4}<nw><%for.body>
; CHECK-NEXT: Group [[GRP40]]:
-; CHECK-NE...
[truncated]
This patch introduces SCEVUse, which is a tagged pointer containing the used `const SCEV *`, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742.

This patch just updates most SCEV infrastructure to operate on SCEVUse instead of `const SCEV *`. It does not introduce any code that makes use of the use-specific flags yet, which I'll share as follow-ups.

Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction.

This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands, though).

SCEVUse could probably also be used to address mis-compiles where equivalent AddRecs (modulo flags) result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix.

Compile-time impact:
stage1-O3: +0.06%
stage1-ReleaseThinLTO: +0.07%
stage1-ReleaseLTO-g: +0.07%
stage2-O3: +0.11%

https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u
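To make the tagged-pointer idea concrete, here is a standalone sketch. This is not the implementation from the SCEVUse patch; it only illustrates the concept described above, assuming SCEV objects are at least 4-byte aligned so the two low pointer bits are free to carry use-specific flags, and the name `SCEVUseSketch` is mine.

```cpp
// Standalone illustration only, not the patch's implementation.
#include <cassert>
#include <cstdint>

struct SCEV {}; // stand-in for llvm::SCEV

class SCEVUseSketch {
  uintptr_t Value = 0;                       // pointer bits | flag bits
  static constexpr uintptr_t FlagMask = 0x3; // e.g. bit 0 = NUW, bit 1 = NSW

public:
  SCEVUseSketch(const SCEV *S = nullptr, unsigned UseFlags = 0)
      : Value(reinterpret_cast<uintptr_t>(S) | (UseFlags & FlagMask)) {
    assert((reinterpret_cast<uintptr_t>(S) & FlagMask) == 0 &&
           "SCEV objects assumed to be at least 4-byte aligned");
  }

  // Converting back to a plain pointer is always safe: it simply drops the
  // use-specific flags of this expression; flags of operands are unaffected,
  // since they live in the operands' own SCEVUse values.
  operator const SCEV *() const {
    return reinterpret_cast<const SCEV *>(Value & ~FlagMask);
  }

  // Raw use-specific flag bits, in the spirit of the Expr.getInt() call in
  // the diff above.
  unsigned getInt() const { return static_cast<unsigned>(Value & FlagMask); }
};
```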
Relax the NUW requirements for isKnownPredicateViaNoOverflow, if the second operand (Y) is a BinOp. The code only simplifies the condition if C1 < C2, so if the BinOp is NUW, it doesn't matter whether the first operand also has the NUW flag, as it cannot wrap if C1 < C2.
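Spelling that argument out (my restatement, not text from the patch): for a common operand $A$ and constants $C_1 \le C_2$,

$$
Y = (A + C_2)\ \text{nuw} \;\wedge\; C_1 \le C_2 \;\Longrightarrow\; A + C_1\ \text{cannot wrap either, and}\ A + C_1 \le_u A + C_2,
$$

so the nuw flag on the first operand $X = A + C_1$ is not needed to conclude $X \le_u Y$.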
Use SCEVUse to add a NUW flag to the upper bound of an accessed pointer. We must already have proved that the pointers do not wrap, as otherwise we could not use them for runtime check computations. By adding the use-specific NUW flag, we can detect cases where SCEV can prove that the compared pointers must overlap, so the runtime checks will always be false. In that case, there is no point in vectorizing with runtime checks. Note that this depends on c2895cd27fbf200d1da056bc66d77eeb62690bf0, which could be submitted separately if desired; without the current change, I don't think it triggers in practice though.
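To make the overlap argument concrete, a minimal, self-contained sketch (my illustration, not code from the patch): each checking group is summarized by a half-open byte range [Low, High), and if SCEV can already prove the two ranges intersect, a runtime no-overlap check between them can never pass.

```cpp
// My illustration, not code from the patch: two groups conflict exactly when
// their [Low, High) byte ranges intersect. If the bounds are precise enough
// (e.g. High = Base + 400 known not to wrap unsigned) to prove the ranges
// *must* intersect, emitting a runtime no-overlap check -- and vectorizing
// behind it -- is pointless.
#include <cstdint>

struct PtrGroupBounds {
  uint64_t Low;  // smallest byte address accessed by the group
  uint64_t High; // one past the largest byte address accessed by the group
};

// General condition for two half-open ranges to intersect.
bool mustOverlap(const PtrGroupBounds &A, const PtrGroupBounds &B) {
  return A.Low < B.High && B.Low < A.High;
}
```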