[SCEV,LAA] Introduce scoped SCEV, use in LAA computations (WIP). #90742

fhahn · 2024-05-01T15:55:08Z

Note that this patch at the moment is mostly an experiment, and I'd
really appreciate any early feedback.

The motivating use case for this is better reasoning about pointer
bounds in LoopAccessAnalysis. LAA creates a number of SCEV expressions
that only need to be valid in the scope of the loop and we know that the
address computations in the loop don't wrap. When we compute the SCEV
for a pointer in the last iteration, this frequently boils down to
Base + Offset, but the resulting SCEV at the moment cannot be marked as
, as the flags for the SCEV expression must be valid in the whole
function.

A concrete example is @test_distance_positive_backwards, where I'd like
to prove that the 2 pointer ranges overlap. (the current code structure
in LAA isn't necessary ideal/finalized, I am mostly looking for feedback
on the SCEV side for now)

I'd like to provide a way to create scoped SCEV expressions, which we
will only use the reason in the context of a loop.

This patch introduces a way to create SCEV commutative expressions that
are only valid for a loop scope. This is done by adding a ExprScope
member to SCEV which is used as additional pointer ID in the lookup
value of UniqueSCEVs. This should ensure that we only return 'scoped'
SCEVs, if the scope is set.

The idea is to keep scoped SCEVs separate from regular SCEVs, as in if
no scope is set, no returned SCEV expression can reference any scoped
expressions (can add assert to that effect I think). This should ensure
that regular SCEVs cannot be 'polluted' by information from scoped
SCEVs. Added some test cases to make sure the modified scoped SCEVs do
not interfere with already cached or later constructed SCEVs in
llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll

It's very likely that I am missing something here. In that case, any
suggestions on how to approach the problem differently?

Note that this patch at the moment is mostly an experiment, and I'd really appreciate any early feedback. The motivating use case for this is better reasoning about pointer bounds in LoopAccessAnalysis. LAA creates a number of SCEV expressions that only need to be valid in the scope of the loop and we know that the address computations in the loop don't wrap. When we compute the SCEV for a pointer in the last iteration, this frequently boils down to Base + Offset, but the resulting SCEV at the moment cannot be marked as <nuw>, as the flags for the SCEV expression must be valid in the whole function. A concrete example is @test_distance_positive_backwards, where I'd like to prove that the 2 pointer ranges overlap. (the current code structure in LAA isn't necessary ideal/finalized, I am mostly looking for feedback on the SCEV side for now) I'd like to provide a way to create scoped SCEV expressions, which we will only use the reason in the context of a loop. This patch introduces a way to create SCEV commutative expressions that are only valid for a loop scope. This is done by adding a ExprScope member to SCEV which is used as additional pointer ID in the lookup value of UniqueSCEVs. This should ensure that we only return 'scoped' SCEVs, if the scope is set. The idea is to keep scoped SCEVs separate from regular SCEVs, as in if no scope is set, no returned SCEV expression can reference any scoped expressions (can add assert to that effect I think). This should ensure that regular SCEVs cannot be 'polluted' by information from scoped SCEVs. Added some test cases to make sure the modified scoped SCEVs do not interfere with already cached or later constructed SCEVs in llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll It's very likely that I am missing something here. In that case, any suggestions on how to approach the problem differently?

llvmbot · 2024-05-01T15:55:43Z

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

Note that this patch at the moment is mostly an experiment, and I'd
really appreciate any early feedback.

The motivating use case for this is better reasoning about pointer
bounds in LoopAccessAnalysis. LAA creates a number of SCEV expressions
that only need to be valid in the scope of the loop and we know that the
address computations in the loop don't wrap. When we compute the SCEV
for a pointer in the last iteration, this frequently boils down to
Base + Offset, but the resulting SCEV at the moment cannot be marked as
<nuw>, as the flags for the SCEV expression must be valid in the whole
function.

A concrete example is @test_distance_positive_backwards, where I'd like
to prove that the 2 pointer ranges overlap. (the current code structure
in LAA isn't necessary ideal/finalized, I am mostly looking for feedback
on the SCEV side for now)

I'd like to provide a way to create scoped SCEV expressions, which we
will only use the reason in the context of a loop.

This patch introduces a way to create SCEV commutative expressions that
are only valid for a loop scope. This is done by adding a ExprScope
member to SCEV which is used as additional pointer ID in the lookup
value of UniqueSCEVs. This should ensure that we only return 'scoped'
SCEVs, if the scope is set.

The idea is to keep scoped SCEVs separate from regular SCEVs, as in if
no scope is set, no returned SCEV expression can reference any scoped
expressions (can add assert to that effect I think). This should ensure
that regular SCEVs cannot be 'polluted' by information from scoped
SCEVs. Added some test cases to make sure the modified scoped SCEVs do
not interfere with already cached or later constructed SCEVs in
llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll

It's very likely that I am missing something here. In that case, any
suggestions on how to approach the problem differently?

Patch is 41.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90742.diff

13 Files Affected:

(modified) llvm/include/llvm/Analysis/LoopAccessAnalysis.h (+4)
(modified) llvm/include/llvm/Analysis/ScalarEvolution.h (+14)
(modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+20-2)
(modified) llvm/lib/Analysis/ScalarEvolution.cpp (+28-1)
(modified) llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll (+30-30)
(modified) llvm/test/Analysis/LoopAccessAnalysis/loops-with-indirect-reads-and-writes.ll (+2-2)
(modified) llvm/test/Analysis/LoopAccessAnalysis/memcheck-off-by-one-error.ll (+2-2)
(modified) llvm/test/Analysis/LoopAccessAnalysis/memcheck-store-vs-alloc-size.ll (+2-2)
(modified) llvm/test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll (+4-4)
(modified) llvm/test/Analysis/LoopAccessAnalysis/pointer-phis.ll (+22-10)
(modified) llvm/test/Analysis/LoopAccessAnalysis/positive-dependence-distance-different-access-sizes.ll (+8-19)
(modified) llvm/test/Analysis/LoopAccessAnalysis/reverse-memcheck-bounds.ll (+1-1)
(added) llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll (+189)

diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index e39c371b41ec5c..ca01db664207ff 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -435,8 +435,10 @@ class RuntimePointerChecking {
   /// Reset the state of the pointer runtime information.
   void reset() {
     Need = false;
+    AlwaysFalse = false;
     Pointers.clear();
     Checks.clear();
+    CheckingGroups.clear();
   }
 
   /// Insert a pointer and calculate the start and end SCEVs.
@@ -493,6 +495,8 @@ class RuntimePointerChecking {
   /// This flag indicates if we need to add the runtime check.
   bool Need = false;
 
+  bool AlwaysFalse = false;
+
   /// Information about the pointers that may require checking.
   SmallVector<PointerInfo, 2> Pointers;
 
diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h b/llvm/include/llvm/Analysis/ScalarEvolution.h
index 5828cc156cc785..7e5fbb8287e14c 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -1346,6 +1346,12 @@ class ScalarEvolution {
     }
   };
 
+  void setExprScope(const Loop *L);
+
+  void clearExprScope();
+
+  bool isScopedExpr(const SCEV *S);
+
 private:
   /// A CallbackVH to arrange for ScalarEvolution to be notified whenever a
   /// Value is deleted.
@@ -1435,6 +1441,14 @@ class ScalarEvolution {
   /// Memoized values for the getConstantMultiple
   DenseMap<const SCEV *, APInt> ConstantMultipleCache;
 
+  /// When not nullptr, this indicates the scope for which an expression needs
+  /// to be valid for. This allows creation of SCEV expressions that only need
+  /// to be valid in a specific loop, allowing to use more specific no-wrap
+  /// flags.
+  const Loop *ExprScope = nullptr;
+
+  SmallVector<const SCEV *> ScopedExprs;
+
   /// Return the Value set from which the SCEV expr is generated.
   ArrayRef<Value *> getSCEVValues(const SCEV *S);
 
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index b0d29e2409f762..4126bc092d4cb3 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -17,6 +17,7 @@
 #include "llvm/ADT/EquivalenceClasses.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallSet.h"
@@ -208,7 +209,8 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
                                     PredicatedScalarEvolution &PSE,
                                     bool NeedsFreeze) {
   ScalarEvolution *SE = PSE.getSE();
-
+  SE->setExprScope(Lp);
+  auto ClearOnExit = make_scope_exit([SE]() { SE->clearExprScope(); });
   const SCEV *ScStart;
   const SCEV *ScEnd;
 
@@ -222,6 +224,10 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
     ScStart = AR->getStart();
     ScEnd = AR->evaluateAtIteration(Ex, *SE);
     const SCEV *Step = AR->getStepRecurrence(*SE);
+    if (AR->getNoWrapFlags(SCEV::FlagNUW) && SE->isScopedExpr(ScEnd)) {
+      if (auto *Comm = dyn_cast<SCEVCommutativeExpr>(ScEnd))
+        const_cast<SCEVCommutativeExpr *>(Comm)->setNoWrapFlags(SCEV::FlagNUW);
+    }
 
     // For expressions with negative step, the upper bound is ScStart and the
     // lower bound is ScEnd.
@@ -243,7 +249,13 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
   auto &DL = Lp->getHeader()->getModule()->getDataLayout();
   Type *IdxTy = DL.getIndexType(Ptr->getType());
   const SCEV *EltSizeSCEV = SE->getStoreSizeOfExpr(IdxTy, AccessTy);
-  ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV);
+  // TODO: this computes one-past-the-end. ScEnd + EltSizeSCEV - 1 is the last
+  // accessed byte. Not entirely sure if one-past-the-end must also not wrap? If
+  // it does, could compute and use last accessed byte instead.
+  if (SE->isScopedExpr(ScEnd))
+    ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV, SCEV::FlagNUW);
+  else
+    ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV, SCEV::FlagNUW);
 
   Pointers.emplace_back(Ptr, ScStart, ScEnd, WritePtr, DepSetId, ASId, PtrExpr,
                         NeedsFreeze);
@@ -378,6 +390,11 @@ SmallVector<RuntimePointerCheck, 4> RuntimePointerChecking::generateChecks() {
       if (needsChecking(CGI, CGJ)) {
         tryToCreateDiffCheck(CGI, CGJ);
         Checks.push_back(std::make_pair(&CGI, &CGJ));
+        if (SE->isKnownPredicate(CmpInst::ICMP_UGT, CGI.High, CGJ.Low) &&
+            SE->isKnownPredicate(CmpInst::ICMP_ULE, CGI.Low, CGJ.High)) {
+          AlwaysFalse = true;
+          return {};
+        }
       }
     }
   }
@@ -1273,6 +1290,7 @@ bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,
   // If we can do run-time checks, but there are no checks, no runtime checks
   // are needed. This can happen when all pointers point to the same underlying
   // object for example.
+  CanDoRT &= !RtCheck.AlwaysFalse;
   RtCheck.Need = CanDoRT ? RtCheck.getNumberOfChecks() != 0 : MayNeedRTCheck;
 
   bool CanDoRTIfNeeded = !RtCheck.Need || CanDoRT;
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 93f885c5d5ad8b..9916714b2854c6 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -2981,6 +2981,26 @@ const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
   return getOrCreateAddExpr(Ops, ComputeFlags(Ops));
 }
 
+void ScalarEvolution::setExprScope(const Loop *L) {
+  assert(!ExprScope && "cannot overwrite existing expression scope");
+  ExprScope = L;
+}
+
+void ScalarEvolution::clearExprScope() { ExprScope = nullptr; }
+
+bool ScalarEvolution::isScopedExpr(const SCEV *S) {
+  if (!ExprScope || !isa<SCEVCommutativeExpr>(S))
+    return false;
+
+  FoldingSetNodeID ID;
+  ID.AddInteger(S->getSCEVType());
+  for (const SCEV *Op : S->operands())
+    ID.AddPointer(Op);
+  ID.AddPointer(ExprScope);
+  void *IP = nullptr;
+  return UniqueSCEVs.FindNodeOrInsertPos(ID, IP);
+}
+
 const SCEV *
 ScalarEvolution::getOrCreateAddExpr(ArrayRef<const SCEV *> Ops,
                                     SCEV::NoWrapFlags Flags) {
@@ -2988,6 +3008,8 @@ ScalarEvolution::getOrCreateAddExpr(ArrayRef<const SCEV *> Ops,
   ID.AddInteger(scAddExpr);
   for (const SCEV *Op : Ops)
     ID.AddPointer(Op);
+  if (ExprScope)
+    ID.AddPointer(ExprScope);
   void *IP = nullptr;
   SCEVAddExpr *S =
       static_cast<SCEVAddExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
@@ -3034,6 +3056,8 @@ ScalarEvolution::getOrCreateMulExpr(ArrayRef<const SCEV *> Ops,
   ID.AddInteger(scMulExpr);
   for (const SCEV *Op : Ops)
     ID.AddPointer(Op);
+  if (ExprScope)
+    ID.AddPointer(ExprScope);
   void *IP = nullptr;
   SCEVMulExpr *S =
     static_cast<SCEVMulExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
@@ -14746,12 +14770,15 @@ PredicatedScalarEvolution::PredicatedScalarEvolution(ScalarEvolution &SE,
 
 void ScalarEvolution::registerUser(const SCEV *User,
                                    ArrayRef<const SCEV *> Ops) {
-  for (const auto *Op : Ops)
+  for (const auto *Op : Ops) {
     // We do not expect that forgetting cached data for SCEVConstants will ever
     // open any prospects for sharpening or introduce any correctness issues,
     // so we don't bother storing their dependencies.
     if (!isa<SCEVConstant>(Op))
       SCEVUsers[Op].insert(User);
+    assert((ExprScope || !isScopedExpr(Op)) &&
+           "Non-scoped expression cannot have scoped operands!");
+  }
 }
 
 const SCEV *PredicatedScalarEvolution::getSCEV(Value *V) {
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
index cd388b4ee87f22..a61c6dcc7af4d7 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
@@ -24,7 +24,7 @@ define void @forked_ptrs_simple(ptr nocapture readonly %Base1, ptr nocapture rea
 ; CHECK-NEXT:          %select = select i1 %cmp, ptr %gep.1, ptr %gep.2
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP1]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%loop>
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%loop>
 ; CHECK-NEXT:        Group [[GRP2]]:
@@ -58,7 +58,7 @@ define void @forked_ptrs_simple(ptr nocapture readonly %Base1, ptr nocapture rea
 ; RECURSE-NEXT:          %select = select i1 %cmp, ptr %gep.1, ptr %gep.2
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP4]]:
-; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%loop>
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%loop>
 ; RECURSE-NEXT:        Group [[GRP5]]:
@@ -132,10 +132,10 @@ define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture read
 ; CHECK-NEXT:          %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP7]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP8]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP9]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (400 + %Base2))
@@ -171,10 +171,10 @@ define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture read
 ; RECURSE-NEXT:          %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP11]]:
-; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP12]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP13]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (400 + %Base2))
@@ -232,10 +232,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_64b(ptr nocapture
 ; CHECK-NEXT:          %.sink.in = getelementptr inbounds double, ptr %spec.select, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP15]]:
-; CHECK-NEXT:          (Low: %Dest High: (800 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (800 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,8}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP16]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP17]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (800 + %Base2))
@@ -271,10 +271,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_64b(ptr nocapture
 ; RECURSE-NEXT:          %.sink.in = getelementptr inbounds double, ptr %spec.select, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP19]]:
-; RECURSE-NEXT:          (Low: %Dest High: (800 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (800 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,8}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP20]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP21]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (800 + %Base2))
@@ -332,10 +332,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_23b(ptr nocapture
 ; CHECK-NEXT:          %.sink.in = getelementptr inbounds i23, ptr %spec.select, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP23]]:
-; CHECK-NEXT:          (Low: %Dest High: (399 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (399 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP24]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP25]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (399 + %Base2))
@@ -371,10 +371,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_23b(ptr nocapture
 ; RECURSE-NEXT:          %.sink.in = getelementptr inbounds i23, ptr %spec.select, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP27]]:
-; RECURSE-NEXT:          (Low: %Dest High: (399 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (399 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP28]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP29]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (399 + %Base2))
@@ -432,10 +432,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_6b(ptr nocapture r
 ; CHECK-NEXT:          %.sink.in = getelementptr inbounds i6, ptr %spec.select, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP31]]:
-; CHECK-NEXT:          (Low: %Dest High: (100 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (100 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,1}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP32]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP33]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (100 + %Base2))
@@ -471,10 +471,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_6b(ptr nocapture r
 ; RECURSE-NEXT:          %.sink.in = getelementptr inbounds i6, ptr %spec.select, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP35]]:
-; RECURSE-NEXT:          (Low: %Dest High: (100 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (100 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,1}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP36]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP37]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (100 + %Base2))
@@ -535,7 +535,7 @@ define dso_local void @forked_ptrs_different_base_same_offset_possible_poison(pt
 ; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nw><%for.body>
 ; CHECK-NEXT:        Group [[GRP40]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP41]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (400 + %Base2))
@@ -574,7 +574,7 @@ define dso_local void @forked_ptrs_different_base_same_offset_possible_poison(pt
 ; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP44]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP45]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (400 + %Base2))
@@ -696,10 +696,10 @@ define dso_local void @forked_ptrs_add_to_offset(ptr nocapture readonly %Base, p
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP47]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP48]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP49]]:
 ; CHECK-NEXT:          (Low: ((4 * %extra_offset) + %Base) High: (404 + (4 * %extra_offset) + %Base))
@@ -764,10 +764,10 @@ define dso_local void @forked_ptrs_sub_from_offset(ptr nocapture readonly %Base,
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP50]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP51]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP52]]:
 ; CHECK-NEXT:          (Low: ((-4 * %extra_offset) + %Base) High: (404 + (-4 * %extra_offset) + %Base))
@@ -832,10 +832,10 @@ define dso_local void @forked_ptrs_add_sub_offset(ptr nocapture readonly %Base,
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP53]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP54]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP55]]:
 ; CHECK-NEXT:          (Low: ((4 * %to_add) + (-4 * %to_sub) + %Base) High: (404 + (4 * %to_add) + (-4 * %to_sub) + %Base))
@@ -1256,7 +1256,7 @@ define void @sc_add_expr_ice(ptr %Base1, ptr %Base2, i64 %N) {
 ; CHECK-NEXT:          %fptr = getelementptr inbounds double, ptr %Base2, i64 %sel
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP56]]:
-; CHECK-NEXT:          (Low: %Base1 High: (8 + %Base1))
+; CHECK-NEXT:          (Low: %Base1 High: (8 + %Base1)<nuw>)
 ; CHECK-NEXT:            Member: %Base1
 ; CHECK-NEXT:        Group [[GRP57]]:
 ; CHECK-NEXT:          (Low: %Base2 High: ((8 * %N) + %Base2))
@@ -1283,7 +1283,7 @@ define void @sc_add_expr_ice(ptr %Base1, ptr %Base2, i64 %N) {
 ; RECURSE-NEXT:          %fptr = getelementptr inbounds double, ptr %Base2, i64 %sel
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP58]]:
-; RECURSE-NEXT:          (Low: %Base1 High: (8 + %Base1))
+; RECURSE-NEXT:          (Low: %Base1 High: (8 + %Base1)<nuw>)
 ; RECURSE-NEXT:            Member: %Base1
 ; RECURSE-NEXT:        Group [[GRP59]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: ((8 * %N) + %Base2))
@@ -1351,7 +1351,7 @@ define void @forked_ptrs_with_different_base(ptr nocapture readonly %Preds, ptr
 ; CHECK-NEXT:          (Low: %2 High: (63992 + %2))
 ; CHECK-NEXT:            Member: {%2,+,8}<nw><%for.body>
 ; CHECK-NEXT:        Group [[GRP61]]:
-; CHECK-NEXT:          (Low: %Preds High: (31996 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (31996 + %Preds)<nuw>)
 ; CHECK-NEXT:            Me...
[truncated]

efriedma-quic · 2024-05-06T04:57:18Z

When I've been thinking about flags in SCEV, I've been mostly considering modifying the flag representation itself: currently, flags have to be valid for the whole function, but we could add another form of flags that are only valid for users of a specific SCEV*. The primary problem with that is that SCEV canonicalization becomes less powerful: you can have two "equal" expressions that don't have the same flags. To work around that, you'd need some fast way to get from a SCEV with user-flags to the corresponding SCEV without those flags. Plus, you have two sets of flags, so reasoning about them becomes a bit harder. I think this is all solvable, but it's a pretty large change.

The alternative here is interesting: attach a scope to certain SCEVs, then you can modify their flags however you want, and nothing else ever sees those expressions.

It seems a bit dangerous to me to make the "scope" global per SCEV: scoped SCEVs could accidentally leak into contexts where you don't intend to use them (like the trip counts of loops). I guess the opposite is also sort of dangerous, though: if you accidentally drop the scope, you can accidentally introduce flags onto unscoped SCEVs.

I'm not sure Loop* pointers reliably live long enough to avoid accidentally reusing a scope; maybe consider just using a counter.

fhahn · 2024-05-08T20:18:28Z

It seems a bit dangerous to me to make the "scope" global per SCEV: scoped SCEVs could accidentally leak into contexts where you don't intend to use them (like the trip counts of loops). I guess the opposite is also sort of dangerous, though: if you accidentally drop the scope, you can accidentally introduce flags onto unscoped SCEVs.

Yes, I think the main ones are computing new trip counts when a scope is set or setNoWrapFlags() to strengthen wrap flags while building other SCEVs being used when a scope is set. I'll see if we can add asserts to guard against those cases.

nikic · 2024-05-09T03:24:09Z

The current handling of SCEV flags is a pretty general issue, and the approach used here will only be usable in limited circumstances -- ideally, we would want to be able to preserve flags when going from IR to SCEV, but I don't think that the scope-based approach here will enable that (or at least, not to a significant degree and not without complications).

The general approach I would consider is to replace our current const SCEV * pointers with SCEVUse, where the low bits are used to store no-wrap flags for a specific use of the expression.

fhahn · 2024-05-10T18:29:34Z

The current handling of SCEV flags is a pretty general issue, and the approach used here will only be usable in limited circumstances -- ideally, we would want to be able to preserve flags when going from IR to SCEV, but I don't think that the scope-based approach here will enable that (or at least, not to a significant degree and not without complications).

Fair point, I also spent a bit more time to see if I would be feasible to prevent leakage but it proved quite tricky.

The general approach I would consider is to replace our current const SCEV * pointers with SCEVUse, where the low bits are used to store no-wrap flags for a specific use of the expression.

Yeah that should also cover my use case here. I'll try to sketch a prototype

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to #90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one #80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

fhahn · 2024-05-13T13:41:21Z

The general approach I would consider is to replace our current const SCEV * pointers with SCEVUse, where the low bits are used to store no-wrap flags for a specific use of the expression.

Yeah that should also cover my use case here. I'll try to sketch a prototype

Sketched in #91961 with 2 additional linked PRs to use SCEVUse. @nikic Is that along the lines of what you had in mind?

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to #90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one #80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to #90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one #80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to #90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one #80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to #90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one #80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u !fixup use raw pointer (const SCEV * + lower bits) for AddPointer. !fix formatting !fixup fix build failures after rebase, address comments !fixup isCanonical/getCanonical Simp

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u !fixup use raw pointer (const SCEV * + lower bits) for AddPointer. !fix formatting !fixup fix build failures after rebase, address comments !fixup isCanonical/getCanonical Simp

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u !fixup use raw pointer (const SCEV * + lower bits) for AddPointer. !fix formatting !fixup fix build failures after rebase, address comments !fixup isCanonical/getCanonical Simp

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

@nikic

This patch introduces SCEVUse, which is a tagged pointer containing the used const SCEV *, plus extra bits to store NUW/NSW flags that are only valid at the specific use. This was suggested by @nikic as an alternative to llvm#90742. This patch just updates most SCEV infrastructure to operate on SCEVUse instead of const SCEV *. It does not introduce any code that makes use of the use-specific flags yet which I'll share as follow-ups. Note that this should be NFC, but currently there's at least one case where it is not (turn-to-invariant.ll), which I'll investigate once we agree on the overall direction. This PR at the moment also contains a commit that updates various SCEV clients to use `const SCEV *` instead of `const auto *`, to prepare for this patch. This reduces the number of changes needed, as SCEVUse will automatically convert to `const SCEV *`. This is a safe default, as it just drops the use-specific flags for the expression (it will not drop any use-specific flags for any of its operands though). This probably SCEVUse could probably also be used to address mis-compiles due to equivalent AddRecs modulo flags result in an AddRec with incorrect flags for some uses of some phis, e.g. the one llvm#80430 attempted to fix Compile-time impact: stage1-O3: +0.06% stage1-ReleaseThinLTO: +0.07% stage1-ReleaseLTO-g: +0.07% stage2-O3: +0.11% https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u

fhahn added 2 commits May 1, 2024 16:53

[SCEV,LAA] Add tests to make sure scoped SCEVs don't impact other SCEVs.

b8cd939

fhahn requested review from preames, Meinersbur and efriedma-quic May 1, 2024 15:55

fhahn requested a review from nikic as a code owner May 1, 2024 15:55

llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label May 1, 2024

fhahn closed this May 10, 2024

fhahn deleted the perf/scoped-scev-for-laa branch May 10, 2024 18:35

fhahn mentioned this pull request May 13, 2024

[SCEV] Introduce SCEVUse, use it instead of const SCEV * (NFCI) (WIP). #91961

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[SCEV,LAA] Introduce scoped SCEV, use in LAA computations (WIP). #90742

[SCEV,LAA] Introduce scoped SCEV, use in LAA computations (WIP). #90742

Uh oh!

fhahn commented May 1, 2024

Uh oh!

llvmbot commented May 1, 2024

Uh oh!

efriedma-quic commented May 6, 2024

Uh oh!

fhahn commented May 8, 2024

Uh oh!

nikic commented May 9, 2024

Uh oh!

fhahn commented May 10, 2024

Uh oh!

fhahn commented May 13, 2024

Uh oh!

Uh oh!

[SCEV,LAA] Introduce scoped SCEV, use in LAA computations (WIP). #90742

[SCEV,LAA] Introduce scoped SCEV, use in LAA computations (WIP). #90742

Uh oh!

Conversation

fhahn commented May 1, 2024

Uh oh!

llvmbot commented May 1, 2024

Uh oh!

efriedma-quic commented May 6, 2024

Uh oh!

fhahn commented May 8, 2024

Uh oh!

nikic commented May 9, 2024

Uh oh!

fhahn commented May 10, 2024

Uh oh!

fhahn commented May 13, 2024

Uh oh!

Uh oh!