[SCEV,LAA] Introduce scoped SCEV, use in LAA computations (WIP). #90742
Conversation
Note that this patch at the moment is mostly an experiment, and I'd really appreciate any early feedback.

The motivating use case for this is better reasoning about pointer bounds in LoopAccessAnalysis. LAA creates a number of SCEV expressions that only need to be valid in the scope of the loop, and we know that the address computations in the loop don't wrap. When we compute the SCEV for a pointer in the last iteration, this frequently boils down to Base + Offset, but the resulting SCEV at the moment cannot be marked as <nuw>, as the flags for a SCEV expression must be valid in the whole function.

A concrete example is @test_distance_positive_backwards, where I'd like to prove that the 2 pointer ranges overlap. (The current code structure in LAA isn't necessarily ideal/finalized; I am mostly looking for feedback on the SCEV side for now.)

I'd like to provide a way to create scoped SCEV expressions, which we will only use to reason in the context of a loop. This patch introduces a way to create SCEV commutative expressions that are only valid for a loop scope. This is done by adding an ExprScope member to ScalarEvolution, which is used as an additional pointer ID in the lookup value of UniqueSCEVs. This should ensure that we only return 'scoped' SCEVs if a scope is set.

The idea is to keep scoped SCEVs separate from regular SCEVs: if no scope is set, no returned SCEV expression can reference any scoped expressions (I think an assert to that effect can be added). This should ensure that regular SCEVs cannot be 'polluted' by information from scoped SCEVs. Some test cases were added in llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll to make sure the modified scoped SCEVs do not interfere with already cached or later constructed SCEVs.

It's very likely that I am missing something here. In that case, any suggestions on how to approach the problem differently?
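To make the intended usage concrete: for a pointer recurrence like {%Dest,+,4}<nuw><%loop> with a backedge-taken count of 99, the end bound evaluates to (400 + %Dest), and within the loop's scope we would like to mark that add <nuw>. Below is a minimal sketch of how a client could drive the new API; setExprScope, clearExprScope, and isScopedExpr come from this patch, while the helper itself, its name, and the use of getBackedgeTakenCount (LAA uses a slightly different trip-count expression) are hypothetical.

#include "llvm/ADT/ScopeExit.h"
#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;

// Hypothetical helper: compute a one-past-the-end bound for AR that only
// needs to be valid within loop L. While the scope is set, new expressions
// are uniqued with L as an extra key, so they never alias unscoped SCEVs.
static const SCEV *computeScopedEnd(ScalarEvolution &SE, const Loop *L,
                                    const SCEVAddRecExpr *AR,
                                    const SCEV *EltSize) {
  SE.setExprScope(L);
  auto ClearOnExit = make_scope_exit([&SE]() { SE.clearExprScope(); });

  const SCEV *BTC = SE.getBackedgeTakenCount(L);
  const SCEV *End = AR->evaluateAtIteration(BTC, SE);

  // The stronger loop-local flag is only safe on scoped expressions, and
  // only when the recurrence itself is known not to wrap.
  if (AR->getNoWrapFlags(SCEV::FlagNUW) && SE.isScopedExpr(End))
    return SE.getAddExpr(End, EltSize, SCEV::FlagNUW);
  return SE.getAddExpr(End, EltSize);
}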
@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Patch is 41.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90742.diff

13 Files Affected:
diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index e39c371b41ec5c..ca01db664207ff 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -435,8 +435,10 @@ class RuntimePointerChecking {
/// Reset the state of the pointer runtime information.
void reset() {
Need = false;
+ AlwaysFalse = false;
Pointers.clear();
Checks.clear();
+ CheckingGroups.clear();
}
/// Insert a pointer and calculate the start and end SCEVs.
@@ -493,6 +495,8 @@ class RuntimePointerChecking {
/// This flag indicates if we need to add the runtime check.
bool Need = false;
+ bool AlwaysFalse = false;
+
/// Information about the pointers that may require checking.
SmallVector<PointerInfo, 2> Pointers;
diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h b/llvm/include/llvm/Analysis/ScalarEvolution.h
index 5828cc156cc785..7e5fbb8287e14c 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -1346,6 +1346,12 @@ class ScalarEvolution {
}
};
+ void setExprScope(const Loop *L);
+
+ void clearExprScope();
+
+ bool isScopedExpr(const SCEV *S);
+
private:
/// A CallbackVH to arrange for ScalarEvolution to be notified whenever a
/// Value is deleted.
@@ -1435,6 +1441,14 @@ class ScalarEvolution {
/// Memoized values for the getConstantMultiple
DenseMap<const SCEV *, APInt> ConstantMultipleCache;
+  /// When not nullptr, this indicates the scope for which an expression needs
+  /// to be valid. This allows creation of SCEV expressions that only need to
+  /// be valid in a specific loop, allowing the use of more specific no-wrap
+  /// flags.
+ const Loop *ExprScope = nullptr;
+
+ SmallVector<const SCEV *> ScopedExprs;
+
/// Return the Value set from which the SCEV expr is generated.
ArrayRef<Value *> getSCEVValues(const SCEV *S);
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index b0d29e2409f762..4126bc092d4cb3 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -17,6 +17,7 @@
#include "llvm/ADT/EquivalenceClasses.h"
#include "llvm/ADT/PointerIntPair.h"
#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallSet.h"
@@ -208,7 +209,8 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
PredicatedScalarEvolution &PSE,
bool NeedsFreeze) {
ScalarEvolution *SE = PSE.getSE();
-
+ SE->setExprScope(Lp);
+ auto ClearOnExit = make_scope_exit([SE]() { SE->clearExprScope(); });
const SCEV *ScStart;
const SCEV *ScEnd;
@@ -222,6 +224,10 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
ScStart = AR->getStart();
ScEnd = AR->evaluateAtIteration(Ex, *SE);
const SCEV *Step = AR->getStepRecurrence(*SE);
+ if (AR->getNoWrapFlags(SCEV::FlagNUW) && SE->isScopedExpr(ScEnd)) {
+ if (auto *Comm = dyn_cast<SCEVCommutativeExpr>(ScEnd))
+ const_cast<SCEVCommutativeExpr *>(Comm)->setNoWrapFlags(SCEV::FlagNUW);
+ }
// For expressions with negative step, the upper bound is ScStart and the
// lower bound is ScEnd.
@@ -243,7 +249,13 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
auto &DL = Lp->getHeader()->getModule()->getDataLayout();
Type *IdxTy = DL.getIndexType(Ptr->getType());
const SCEV *EltSizeSCEV = SE->getStoreSizeOfExpr(IdxTy, AccessTy);
- ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV);
+  // TODO: This computes one-past-the-end. ScEnd + EltSizeSCEV - 1 is the last
+  // accessed byte. Not entirely sure whether one-past-the-end must also not
+  // wrap; if it must, we could compute and use the last accessed byte instead.
+  if (SE->isScopedExpr(ScEnd))
+    ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV, SCEV::FlagNUW);
+  else
+    ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV);
Pointers.emplace_back(Ptr, ScStart, ScEnd, WritePtr, DepSetId, ASId, PtrExpr,
NeedsFreeze);
@@ -378,6 +390,11 @@ SmallVector<RuntimePointerCheck, 4> RuntimePointerChecking::generateChecks() {
if (needsChecking(CGI, CGJ)) {
tryToCreateDiffCheck(CGI, CGJ);
Checks.push_back(std::make_pair(&CGI, &CGJ));
+ if (SE->isKnownPredicate(CmpInst::ICMP_UGT, CGI.High, CGJ.Low) &&
+ SE->isKnownPredicate(CmpInst::ICMP_ULE, CGI.Low, CGJ.High)) {
+ AlwaysFalse = true;
+ return {};
+ }
}
}
}
@@ -1273,6 +1290,7 @@ bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,
// If we can do run-time checks, but there are no checks, no runtime checks
// are needed. This can happen when all pointers point to the same underlying
// object for example.
+ CanDoRT &= !RtCheck.AlwaysFalse;
RtCheck.Need = CanDoRT ? RtCheck.getNumberOfChecks() != 0 : MayNeedRTCheck;
bool CanDoRTIfNeeded = !RtCheck.Need || CanDoRT;
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 93f885c5d5ad8b..9916714b2854c6 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -2981,6 +2981,26 @@ const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
return getOrCreateAddExpr(Ops, ComputeFlags(Ops));
}
+void ScalarEvolution::setExprScope(const Loop *L) {
+ assert(!ExprScope && "cannot overwrite existing expression scope");
+ ExprScope = L;
+}
+
+void ScalarEvolution::clearExprScope() { ExprScope = nullptr; }
+
+bool ScalarEvolution::isScopedExpr(const SCEV *S) {
+ if (!ExprScope || !isa<SCEVCommutativeExpr>(S))
+ return false;
+
+ FoldingSetNodeID ID;
+ ID.AddInteger(S->getSCEVType());
+ for (const SCEV *Op : S->operands())
+ ID.AddPointer(Op);
+ ID.AddPointer(ExprScope);
+ void *IP = nullptr;
+ return UniqueSCEVs.FindNodeOrInsertPos(ID, IP);
+}
+
const SCEV *
ScalarEvolution::getOrCreateAddExpr(ArrayRef<const SCEV *> Ops,
SCEV::NoWrapFlags Flags) {
@@ -2988,6 +3008,8 @@ ScalarEvolution::getOrCreateAddExpr(ArrayRef<const SCEV *> Ops,
ID.AddInteger(scAddExpr);
for (const SCEV *Op : Ops)
ID.AddPointer(Op);
+ if (ExprScope)
+ ID.AddPointer(ExprScope);
void *IP = nullptr;
SCEVAddExpr *S =
static_cast<SCEVAddExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
@@ -3034,6 +3056,8 @@ ScalarEvolution::getOrCreateMulExpr(ArrayRef<const SCEV *> Ops,
ID.AddInteger(scMulExpr);
for (const SCEV *Op : Ops)
ID.AddPointer(Op);
+ if (ExprScope)
+ ID.AddPointer(ExprScope);
void *IP = nullptr;
SCEVMulExpr *S =
static_cast<SCEVMulExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
@@ -14746,12 +14770,15 @@ PredicatedScalarEvolution::PredicatedScalarEvolution(ScalarEvolution &SE,
void ScalarEvolution::registerUser(const SCEV *User,
ArrayRef<const SCEV *> Ops) {
- for (const auto *Op : Ops)
+ for (const auto *Op : Ops) {
// We do not expect that forgetting cached data for SCEVConstants will ever
// open any prospects for sharpening or introduce any correctness issues,
// so we don't bother storing their dependencies.
if (!isa<SCEVConstant>(Op))
SCEVUsers[Op].insert(User);
+ assert((ExprScope || !isScopedExpr(Op)) &&
+ "Non-scoped expression cannot have scoped operands!");
+ }
}
const SCEV *PredicatedScalarEvolution::getSCEV(Value *V) {
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
index cd388b4ee87f22..a61c6dcc7af4d7 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
@@ -24,7 +24,7 @@ define void @forked_ptrs_simple(ptr nocapture readonly %Base1, ptr nocapture rea
; CHECK-NEXT: %select = select i1 %cmp, ptr %gep.1, ptr %gep.2
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP1]]:
-; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (400 + %Dest)<nuw>)
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%loop>
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%loop>
; CHECK-NEXT: Group [[GRP2]]:
@@ -58,7 +58,7 @@ define void @forked_ptrs_simple(ptr nocapture readonly %Base1, ptr nocapture rea
; RECURSE-NEXT: %select = select i1 %cmp, ptr %gep.1, ptr %gep.2
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP4]]:
-; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest)<nuw>)
; RECURSE-NEXT: Member: {%Dest,+,4}<nuw><%loop>
; RECURSE-NEXT: Member: {%Dest,+,4}<nuw><%loop>
; RECURSE-NEXT: Group [[GRP5]]:
@@ -132,10 +132,10 @@ define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture read
; CHECK-NEXT: %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP7]]:
-; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (400 + %Dest)<nuw>)
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP8]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP9]]:
; CHECK-NEXT: (Low: %Base2 High: (400 + %Base2))
@@ -171,10 +171,10 @@ define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture read
; RECURSE-NEXT: %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP11]]:
-; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest)<nuw>)
; RECURSE-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP12]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP13]]:
; RECURSE-NEXT: (Low: %Base2 High: (400 + %Base2))
@@ -232,10 +232,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_64b(ptr nocapture
; CHECK-NEXT: %.sink.in = getelementptr inbounds double, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP15]]:
-; CHECK-NEXT: (Low: %Dest High: (800 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (800 + %Dest)<nuw>)
; CHECK-NEXT: Member: {%Dest,+,8}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP16]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP17]]:
; CHECK-NEXT: (Low: %Base2 High: (800 + %Base2))
@@ -271,10 +271,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_64b(ptr nocapture
; RECURSE-NEXT: %.sink.in = getelementptr inbounds double, ptr %spec.select, i64 %indvars.iv
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP19]]:
-; RECURSE-NEXT: (Low: %Dest High: (800 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (800 + %Dest)<nuw>)
; RECURSE-NEXT: Member: {%Dest,+,8}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP20]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP21]]:
; RECURSE-NEXT: (Low: %Base2 High: (800 + %Base2))
@@ -332,10 +332,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_23b(ptr nocapture
; CHECK-NEXT: %.sink.in = getelementptr inbounds i23, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP23]]:
-; CHECK-NEXT: (Low: %Dest High: (399 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (399 + %Dest)<nuw>)
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP24]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP25]]:
; CHECK-NEXT: (Low: %Base2 High: (399 + %Base2))
@@ -371,10 +371,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_23b(ptr nocapture
; RECURSE-NEXT: %.sink.in = getelementptr inbounds i23, ptr %spec.select, i64 %indvars.iv
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP27]]:
-; RECURSE-NEXT: (Low: %Dest High: (399 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (399 + %Dest)<nuw>)
; RECURSE-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP28]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP29]]:
; RECURSE-NEXT: (Low: %Base2 High: (399 + %Base2))
@@ -432,10 +432,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_6b(ptr nocapture r
; CHECK-NEXT: %.sink.in = getelementptr inbounds i6, ptr %spec.select, i64 %indvars.iv
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP31]]:
-; CHECK-NEXT: (Low: %Dest High: (100 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (100 + %Dest)<nuw>)
; CHECK-NEXT: Member: {%Dest,+,1}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP32]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP33]]:
; CHECK-NEXT: (Low: %Base2 High: (100 + %Base2))
@@ -471,10 +471,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_6b(ptr nocapture r
; RECURSE-NEXT: %.sink.in = getelementptr inbounds i6, ptr %spec.select, i64 %indvars.iv
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP35]]:
-; RECURSE-NEXT: (Low: %Dest High: (100 + %Dest))
+; RECURSE-NEXT: (Low: %Dest High: (100 + %Dest)<nuw>)
; RECURSE-NEXT: Member: {%Dest,+,1}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP36]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP37]]:
; RECURSE-NEXT: (Low: %Base2 High: (100 + %Base2))
@@ -535,7 +535,7 @@ define dso_local void @forked_ptrs_different_base_same_offset_possible_poison(pt
; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
; CHECK-NEXT: Member: {%Dest,+,4}<nw><%for.body>
; CHECK-NEXT: Group [[GRP40]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP41]]:
; CHECK-NEXT: (Low: %Base2 High: (400 + %Base2))
@@ -574,7 +574,7 @@ define dso_local void @forked_ptrs_different_base_same_offset_possible_poison(pt
; RECURSE-NEXT: (Low: %Dest High: (400 + %Dest))
; RECURSE-NEXT: Member: {%Dest,+,4}<nw><%for.body>
; RECURSE-NEXT: Group [[GRP44]]:
-; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; RECURSE-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; RECURSE-NEXT: Group [[GRP45]]:
; RECURSE-NEXT: (Low: %Base2 High: (400 + %Base2))
@@ -696,10 +696,10 @@ define dso_local void @forked_ptrs_add_to_offset(ptr nocapture readonly %Base, p
; CHECK-NEXT: %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP47]]:
-; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (400 + %Dest)<nuw>)
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP48]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP49]]:
; CHECK-NEXT: (Low: ((4 * %extra_offset) + %Base) High: (404 + (4 * %extra_offset) + %Base))
@@ -764,10 +764,10 @@ define dso_local void @forked_ptrs_sub_from_offset(ptr nocapture readonly %Base,
; CHECK-NEXT: %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP50]]:
-; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (400 + %Dest)<nuw>)
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP51]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP52]]:
; CHECK-NEXT: (Low: ((-4 * %extra_offset) + %Base) High: (404 + (-4 * %extra_offset) + %Base))
@@ -832,10 +832,10 @@ define dso_local void @forked_ptrs_add_sub_offset(ptr nocapture readonly %Base,
; CHECK-NEXT: %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP53]]:
-; CHECK-NEXT: (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT: (Low: %Dest High: (400 + %Dest)<nuw>)
; CHECK-NEXT: Member: {%Dest,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP54]]:
-; CHECK-NEXT: (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (400 + %Preds)<nuw>)
; CHECK-NEXT: Member: {%Preds,+,4}<nuw><%for.body>
; CHECK-NEXT: Group [[GRP55]]:
; CHECK-NEXT: (Low: ((4 * %to_add) + (-4 * %to_sub) + %Base) High: (404 + (4 * %to_add) + (-4 * %to_sub) + %Base))
@@ -1256,7 +1256,7 @@ define void @sc_add_expr_ice(ptr %Base1, ptr %Base2, i64 %N) {
; CHECK-NEXT: %fptr = getelementptr inbounds double, ptr %Base2, i64 %sel
; CHECK-NEXT: Grouped accesses:
; CHECK-NEXT: Group [[GRP56]]:
-; CHECK-NEXT: (Low: %Base1 High: (8 + %Base1))
+; CHECK-NEXT: (Low: %Base1 High: (8 + %Base1)<nuw>)
; CHECK-NEXT: Member: %Base1
; CHECK-NEXT: Group [[GRP57]]:
; CHECK-NEXT: (Low: %Base2 High: ((8 * %N) + %Base2))
@@ -1283,7 +1283,7 @@ define void @sc_add_expr_ice(ptr %Base1, ptr %Base2, i64 %N) {
; RECURSE-NEXT: %fptr = getelementptr inbounds double, ptr %Base2, i64 %sel
; RECURSE-NEXT: Grouped accesses:
; RECURSE-NEXT: Group [[GRP58]]:
-; RECURSE-NEXT: (Low: %Base1 High: (8 + %Base1))
+; RECURSE-NEXT: (Low: %Base1 High: (8 + %Base1)<nuw>)
; RECURSE-NEXT: Member: %Base1
; RECURSE-NEXT: Group [[GRP59]]:
; RECURSE-NEXT: (Low: %Base2 High: ((8 * %N) + %Base2))
@@ -1351,7 +1351,7 @@ define void @forked_ptrs_with_different_base(ptr nocapture readonly %Preds, ptr
; CHECK-NEXT: (Low: %2 High: (63992 + %2))
; CHECK-NEXT: Member: {%2,+,8}<nw><%for.body>
; CHECK-NEXT: Group [[GRP61]]:
-; CHECK-NEXT: (Low: %Preds High: (31996 + %Preds))
+; CHECK-NEXT: (Low: %Preds High: (31996 + %Preds)<nuw>)
; CHECK-NEXT: Me...
[truncated]
When I've been thinking about flags in SCEV, I've been mostly considering modifying the flag representation itself: currently, flags have to be valid for the whole function, but we could add another form of flags that are only valid for users of a specific SCEV*. The primary problem with that is that SCEV canonicalization becomes less powerful: you can have two "equal" expressions that don't have the same flags. To work around that, you'd need some fast way to get from a SCEV with user-flags to the corresponding SCEV without those flags. Plus, you have two sets of flags, so reasoning about them becomes a bit harder. I think this is all solvable, but it's a pretty large change.

The alternative here is interesting: attach a scope to certain SCEVs, then you can modify their flags however you want, and nothing else ever sees those expressions. It seems a bit dangerous to me to make the "scope" global per SCEV: scoped SCEVs could accidentally leak into contexts where you don't intend to use them (like the trip counts of loops). I guess the opposite is also sort of dangerous, though: if you accidentally drop the scope, you can accidentally introduce flags onto unscoped SCEVs.

I'm not sure Loop* pointers reliably live long enough to avoid accidentally reusing a scope; maybe consider just using a counter.
Yes, I think the main ones are computing new trip counts when a scope is set or
The current handling of SCEV flags is a pretty general issue, and the approach used here will only be usable in limited circumstances -- ideally, we would want to be able to preserve flags when going from IR to SCEV, but I don't think that the scope-based approach here will enable that (or at least, not to a significant degree and not without complications). The general approach I would consider is to replace our current `const SCEV *` uses with a tagged pointer that can additionally carry use-specific no-wrap flags.
Fair point. I also spent a bit more time to see if it would be feasible to prevent leakage, but it proved quite tricky.
Yeah, that should also cover my use case here. I'll try to sketch a prototype.
This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to #90742.

This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet, which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction.

This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands, though).

SCEVUse could probably also be used to address mis-compiles where AddRecs that are equivalent modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one #80430 attempted to fix.

Compile-time impact:
stage1-O3: +0.06%
stage1-ReleaseThinLTO: +0.07%
stage1-ReleaseLTO-g: +0.07%
stage2-O3: +0.11%

https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u
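For intuition, here is a minimal sketch of the tagged-pointer shape described above, built on llvm::PointerIntPair; the class layout and accessor names are illustrative assumptions, not the PR's actual definition.

#include "llvm/ADT/PointerIntPair.h"
#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;

// Illustrative sketch: the spare low bits of the aligned SCEV pointer hold
// NUW/NSW flags that are valid only at this particular use.
class SCEVUse {
  PointerIntPair<const SCEV *, 2> Ptr; // bit 0 = use-NUW, bit 1 = use-NSW

public:
  SCEVUse(const SCEV *S, unsigned UseFlags = 0) : Ptr(S, UseFlags) {}

  // The "safe default" described above: converting back to const SCEV *
  // drops the use-specific flags and yields the canonical uniqued node.
  operator const SCEV *() const { return Ptr.getPointer(); }

  bool hasUseNUW() const { return Ptr.getInt() & 1; }
  bool hasUseNSW() const { return Ptr.getInt() & 2; }
};

Because the flags live outside the uniqued node, two uses of the "same" expression still compare equal after stripping the low bits, which gives the fast path back to the flag-free canonical expression asked for earlier in the thread.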
Sketched in #91961 with 2 additional linked PRs to use SCEVUse. @nikic Is that along the lines of what you had in mind?