
[SCEV,LAA] Introduce scoped SCEV, use in LAA computations (WIP). #90742


Closed
fhahn wants to merge 2 commits from perf/scoped-scev-for-laa

Conversation

fhahn
Contributor

@fhahn fhahn commented May 1, 2024

Note that this patch at the moment is mostly an experiment, and I'd
really appreciate any early feedback.

The motivating use case for this is better reasoning about pointer
bounds in LoopAccessAnalysis. LAA creates a number of SCEV expressions
that only need to be valid in the scope of the loop, where we know that
the address computations don't wrap. When we compute the SCEV for a
pointer in the last iteration, this frequently boils down to
Base + Offset, but the resulting SCEV at the moment cannot be marked as
<nuw>, as the flags for the SCEV expression must be valid in the whole
function.

A concrete example is @test_distance_positive_backwards, where I'd like
to prove that the 2 pointer ranges overlap. (The current code structure
in LAA isn't necessarily ideal/finalized; I am mostly looking for
feedback on the SCEV side for now.)

I'd like to provide a way to create scoped SCEV expressions, which we
will only use to reason in the context of a loop.

This patch introduces a way to create SCEV commutative expressions that
are only valid for a loop scope. This is done by adding an ExprScope
member to ScalarEvolution, which is used as an additional pointer ID in
the lookup key for UniqueSCEVs. This should ensure that we only return
'scoped' SCEVs if a scope is set.
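Concretely, the lookup-key change (condensed from the patch's
getOrCreateAddExpr hunk below) looks like this:

    FoldingSetNodeID ID;
    ID.AddInteger(scAddExpr);
    for (const SCEV *Op : Ops)
      ID.AddPointer(Op);
    if (ExprScope)
      ID.AddPointer(ExprScope); // scoped expressions unique into distinct nodes
    void *IP = nullptr;
    SCEVAddExpr *S =
        static_cast<SCEVAddExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));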

The idea is to keep scoped SCEVs separate from regular SCEVs: if no
scope is set, no returned SCEV expression can reference any scoped
expressions (I think an assert to that effect can be added). This should
ensure that regular SCEVs cannot be 'polluted' by information from
scoped SCEVs. I added some test cases in
llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll to make sure the
modified scoped SCEVs do not interfere with already-cached or
later-constructed SCEVs.

It's very likely that I am missing something here. In that case, any
suggestions on how to approach the problem differently?
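For reference, the intended usage pattern from the LAA changes below is
to set the scope around the pointer-bound computation and clear it on
exit (make_scope_exit is from llvm/ADT/ScopeExit.h):

    ScalarEvolution *SE = PSE.getSE();
    SE->setExprScope(Lp);
    auto ClearOnExit = make_scope_exit([SE]() { SE->clearExprScope(); });
    // SCEV expressions created before ClearOnExit fires may carry no-wrap
    // flags that only need to hold within the loop Lp.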

fhahn added 2 commits May 1, 2024 16:53
@fhahn fhahn requested a review from nikic as a code owner May 1, 2024 15:55
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label May 1, 2024
@llvmbot
Member

llvmbot commented May 1, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)



Patch is 41.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90742.diff

13 Files Affected:

  • (modified) llvm/include/llvm/Analysis/LoopAccessAnalysis.h (+4)
  • (modified) llvm/include/llvm/Analysis/ScalarEvolution.h (+14)
  • (modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+20-2)
  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+28-1)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll (+30-30)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/loops-with-indirect-reads-and-writes.ll (+2-2)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/memcheck-off-by-one-error.ll (+2-2)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/memcheck-store-vs-alloc-size.ll (+2-2)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll (+4-4)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/pointer-phis.ll (+22-10)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/positive-dependence-distance-different-access-sizes.ll (+8-19)
  • (modified) llvm/test/Analysis/LoopAccessAnalysis/reverse-memcheck-bounds.ll (+1-1)
  • (added) llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll (+189)
diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index e39c371b41ec5c..ca01db664207ff 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -435,8 +435,10 @@ class RuntimePointerChecking {
   /// Reset the state of the pointer runtime information.
   void reset() {
     Need = false;
+    AlwaysFalse = false;
     Pointers.clear();
     Checks.clear();
+    CheckingGroups.clear();
   }
 
   /// Insert a pointer and calculate the start and end SCEVs.
@@ -493,6 +495,8 @@ class RuntimePointerChecking {
   /// This flag indicates if we need to add the runtime check.
   bool Need = false;
 
+  bool AlwaysFalse = false;
+
   /// Information about the pointers that may require checking.
   SmallVector<PointerInfo, 2> Pointers;
 
diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h b/llvm/include/llvm/Analysis/ScalarEvolution.h
index 5828cc156cc785..7e5fbb8287e14c 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -1346,6 +1346,12 @@ class ScalarEvolution {
     }
   };
 
+  void setExprScope(const Loop *L);
+
+  void clearExprScope();
+
+  bool isScopedExpr(const SCEV *S);
+
 private:
   /// A CallbackVH to arrange for ScalarEvolution to be notified whenever a
   /// Value is deleted.
@@ -1435,6 +1441,14 @@ class ScalarEvolution {
   /// Memoized values for the getConstantMultiple
   DenseMap<const SCEV *, APInt> ConstantMultipleCache;
 
+  /// When not nullptr, this indicates the scope for which an expression
+  /// needs to be valid. This allows creation of SCEV expressions that only
+  /// need to be valid in a specific loop, allowing the use of more specific
+  /// no-wrap flags.
+  const Loop *ExprScope = nullptr;
+
+  SmallVector<const SCEV *> ScopedExprs;
+
   /// Return the Value set from which the SCEV expr is generated.
   ArrayRef<Value *> getSCEVValues(const SCEV *S);
 
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index b0d29e2409f762..4126bc092d4cb3 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -17,6 +17,7 @@
 #include "llvm/ADT/EquivalenceClasses.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallSet.h"
@@ -208,7 +209,8 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
                                     PredicatedScalarEvolution &PSE,
                                     bool NeedsFreeze) {
   ScalarEvolution *SE = PSE.getSE();
-
+  SE->setExprScope(Lp);
+  auto ClearOnExit = make_scope_exit([SE]() { SE->clearExprScope(); });
   const SCEV *ScStart;
   const SCEV *ScEnd;
 
@@ -222,6 +224,10 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
     ScStart = AR->getStart();
     ScEnd = AR->evaluateAtIteration(Ex, *SE);
     const SCEV *Step = AR->getStepRecurrence(*SE);
+    if (AR->getNoWrapFlags(SCEV::FlagNUW) && SE->isScopedExpr(ScEnd)) {
+      if (auto *Comm = dyn_cast<SCEVCommutativeExpr>(ScEnd))
+        const_cast<SCEVCommutativeExpr *>(Comm)->setNoWrapFlags(SCEV::FlagNUW);
+    }
 
     // For expressions with negative step, the upper bound is ScStart and the
     // lower bound is ScEnd.
@@ -243,7 +249,13 @@ void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, const SCEV *PtrExpr,
   auto &DL = Lp->getHeader()->getModule()->getDataLayout();
   Type *IdxTy = DL.getIndexType(Ptr->getType());
   const SCEV *EltSizeSCEV = SE->getStoreSizeOfExpr(IdxTy, AccessTy);
-  ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV);
+  // TODO: this computes one-past-the-end. ScEnd + EltSizeSCEV - 1 is the last
+  // accessed byte. Not entirely sure if one-past-the-end must also not wrap; if
+  // it does, we could compute and use the last accessed byte instead.
+  if (SE->isScopedExpr(ScEnd))
+    ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV, SCEV::FlagNUW);
+  else
+    ScEnd = SE->getAddExpr(ScEnd, EltSizeSCEV);
 
   Pointers.emplace_back(Ptr, ScStart, ScEnd, WritePtr, DepSetId, ASId, PtrExpr,
                         NeedsFreeze);
@@ -378,6 +390,11 @@ SmallVector<RuntimePointerCheck, 4> RuntimePointerChecking::generateChecks() {
       if (needsChecking(CGI, CGJ)) {
         tryToCreateDiffCheck(CGI, CGJ);
         Checks.push_back(std::make_pair(&CGI, &CGJ));
+        if (SE->isKnownPredicate(CmpInst::ICMP_UGT, CGI.High, CGJ.Low) &&
+            SE->isKnownPredicate(CmpInst::ICMP_ULE, CGI.Low, CGJ.High)) {
+          AlwaysFalse = true;
+          return {};
+        }
       }
     }
   }
@@ -1273,6 +1290,7 @@ bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,
   // If we can do run-time checks, but there are no checks, no runtime checks
   // are needed. This can happen when all pointers point to the same underlying
   // object for example.
+  CanDoRT &= !RtCheck.AlwaysFalse;
   RtCheck.Need = CanDoRT ? RtCheck.getNumberOfChecks() != 0 : MayNeedRTCheck;
 
   bool CanDoRTIfNeeded = !RtCheck.Need || CanDoRT;
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 93f885c5d5ad8b..9916714b2854c6 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -2981,6 +2981,26 @@ const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
   return getOrCreateAddExpr(Ops, ComputeFlags(Ops));
 }
 
+void ScalarEvolution::setExprScope(const Loop *L) {
+  assert(!ExprScope && "cannot overwrite existing expression scope");
+  ExprScope = L;
+}
+
+void ScalarEvolution::clearExprScope() { ExprScope = nullptr; }
+
+bool ScalarEvolution::isScopedExpr(const SCEV *S) {
+  if (!ExprScope || !isa<SCEVCommutativeExpr>(S))
+    return false;
+
+  FoldingSetNodeID ID;
+  ID.AddInteger(S->getSCEVType());
+  for (const SCEV *Op : S->operands())
+    ID.AddPointer(Op);
+  ID.AddPointer(ExprScope);
+  void *IP = nullptr;
+  return UniqueSCEVs.FindNodeOrInsertPos(ID, IP);
+}
+
 const SCEV *
 ScalarEvolution::getOrCreateAddExpr(ArrayRef<const SCEV *> Ops,
                                     SCEV::NoWrapFlags Flags) {
@@ -2988,6 +3008,8 @@ ScalarEvolution::getOrCreateAddExpr(ArrayRef<const SCEV *> Ops,
   ID.AddInteger(scAddExpr);
   for (const SCEV *Op : Ops)
     ID.AddPointer(Op);
+  if (ExprScope)
+    ID.AddPointer(ExprScope);
   void *IP = nullptr;
   SCEVAddExpr *S =
       static_cast<SCEVAddExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
@@ -3034,6 +3056,8 @@ ScalarEvolution::getOrCreateMulExpr(ArrayRef<const SCEV *> Ops,
   ID.AddInteger(scMulExpr);
   for (const SCEV *Op : Ops)
     ID.AddPointer(Op);
+  if (ExprScope)
+    ID.AddPointer(ExprScope);
   void *IP = nullptr;
   SCEVMulExpr *S =
     static_cast<SCEVMulExpr *>(UniqueSCEVs.FindNodeOrInsertPos(ID, IP));
@@ -14746,12 +14770,15 @@ PredicatedScalarEvolution::PredicatedScalarEvolution(ScalarEvolution &SE,
 
 void ScalarEvolution::registerUser(const SCEV *User,
                                    ArrayRef<const SCEV *> Ops) {
-  for (const auto *Op : Ops)
+  for (const auto *Op : Ops) {
     // We do not expect that forgetting cached data for SCEVConstants will ever
     // open any prospects for sharpening or introduce any correctness issues,
     // so we don't bother storing their dependencies.
     if (!isa<SCEVConstant>(Op))
       SCEVUsers[Op].insert(User);
+    assert((ExprScope || !isScopedExpr(Op)) &&
+           "Non-scoped expression cannot have scoped operands!");
+  }
 }
 
 const SCEV *PredicatedScalarEvolution::getSCEV(Value *V) {
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
index cd388b4ee87f22..a61c6dcc7af4d7 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
@@ -24,7 +24,7 @@ define void @forked_ptrs_simple(ptr nocapture readonly %Base1, ptr nocapture rea
 ; CHECK-NEXT:          %select = select i1 %cmp, ptr %gep.1, ptr %gep.2
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP1]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%loop>
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%loop>
 ; CHECK-NEXT:        Group [[GRP2]]:
@@ -58,7 +58,7 @@ define void @forked_ptrs_simple(ptr nocapture readonly %Base1, ptr nocapture rea
 ; RECURSE-NEXT:          %select = select i1 %cmp, ptr %gep.1, ptr %gep.2
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP4]]:
-; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%loop>
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%loop>
 ; RECURSE-NEXT:        Group [[GRP5]]:
@@ -132,10 +132,10 @@ define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture read
 ; CHECK-NEXT:          %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP7]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP8]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP9]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (400 + %Base2))
@@ -171,10 +171,10 @@ define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture read
 ; RECURSE-NEXT:          %.sink.in = getelementptr inbounds float, ptr %spec.select, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP11]]:
-; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP12]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP13]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (400 + %Base2))
@@ -232,10 +232,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_64b(ptr nocapture
 ; CHECK-NEXT:          %.sink.in = getelementptr inbounds double, ptr %spec.select, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP15]]:
-; CHECK-NEXT:          (Low: %Dest High: (800 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (800 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,8}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP16]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP17]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (800 + %Base2))
@@ -271,10 +271,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_64b(ptr nocapture
 ; RECURSE-NEXT:          %.sink.in = getelementptr inbounds double, ptr %spec.select, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP19]]:
-; RECURSE-NEXT:          (Low: %Dest High: (800 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (800 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,8}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP20]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP21]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (800 + %Base2))
@@ -332,10 +332,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_23b(ptr nocapture
 ; CHECK-NEXT:          %.sink.in = getelementptr inbounds i23, ptr %spec.select, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP23]]:
-; CHECK-NEXT:          (Low: %Dest High: (399 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (399 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP24]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP25]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (399 + %Base2))
@@ -371,10 +371,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_23b(ptr nocapture
 ; RECURSE-NEXT:          %.sink.in = getelementptr inbounds i23, ptr %spec.select, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP27]]:
-; RECURSE-NEXT:          (Low: %Dest High: (399 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (399 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP28]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP29]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (399 + %Base2))
@@ -432,10 +432,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_6b(ptr nocapture r
 ; CHECK-NEXT:          %.sink.in = getelementptr inbounds i6, ptr %spec.select, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP31]]:
-; CHECK-NEXT:          (Low: %Dest High: (100 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (100 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,1}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP32]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP33]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (100 + %Base2))
@@ -471,10 +471,10 @@ define dso_local void @forked_ptrs_different_base_same_offset_6b(ptr nocapture r
 ; RECURSE-NEXT:          %.sink.in = getelementptr inbounds i6, ptr %spec.select, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP35]]:
-; RECURSE-NEXT:          (Low: %Dest High: (100 + %Dest))
+; RECURSE-NEXT:          (Low: %Dest High: (100 + %Dest)<nuw>)
 ; RECURSE-NEXT:            Member: {%Dest,+,1}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP36]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP37]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (100 + %Base2))
@@ -535,7 +535,7 @@ define dso_local void @forked_ptrs_different_base_same_offset_possible_poison(pt
 ; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nw><%for.body>
 ; CHECK-NEXT:        Group [[GRP40]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP41]]:
 ; CHECK-NEXT:          (Low: %Base2 High: (400 + %Base2))
@@ -574,7 +574,7 @@ define dso_local void @forked_ptrs_different_base_same_offset_possible_poison(pt
 ; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
 ; RECURSE-NEXT:            Member: {%Dest,+,4}<nw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP44]]:
-; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-NEXT:        Group [[GRP45]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: (400 + %Base2))
@@ -696,10 +696,10 @@ define dso_local void @forked_ptrs_add_to_offset(ptr nocapture readonly %Base, p
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP47]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP48]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP49]]:
 ; CHECK-NEXT:          (Low: ((4 * %extra_offset) + %Base) High: (404 + (4 * %extra_offset) + %Base))
@@ -764,10 +764,10 @@ define dso_local void @forked_ptrs_sub_from_offset(ptr nocapture readonly %Base,
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP50]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP51]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP52]]:
 ; CHECK-NEXT:          (Low: ((-4 * %extra_offset) + %Base) High: (404 + (-4 * %extra_offset) + %Base))
@@ -832,10 +832,10 @@ define dso_local void @forked_ptrs_add_sub_offset(ptr nocapture readonly %Base,
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP53]]:
-; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest)<nuw>)
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP54]]:
-; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds)<nuw>)
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-NEXT:        Group [[GRP55]]:
 ; CHECK-NEXT:          (Low: ((4 * %to_add) + (-4 * %to_sub) + %Base) High: (404 + (4 * %to_add) + (-4 * %to_sub) + %Base))
@@ -1256,7 +1256,7 @@ define void @sc_add_expr_ice(ptr %Base1, ptr %Base2, i64 %N) {
 ; CHECK-NEXT:          %fptr = getelementptr inbounds double, ptr %Base2, i64 %sel
 ; CHECK-NEXT:      Grouped accesses:
 ; CHECK-NEXT:        Group [[GRP56]]:
-; CHECK-NEXT:          (Low: %Base1 High: (8 + %Base1))
+; CHECK-NEXT:          (Low: %Base1 High: (8 + %Base1)<nuw>)
 ; CHECK-NEXT:            Member: %Base1
 ; CHECK-NEXT:        Group [[GRP57]]:
 ; CHECK-NEXT:          (Low: %Base2 High: ((8 * %N) + %Base2))
@@ -1283,7 +1283,7 @@ define void @sc_add_expr_ice(ptr %Base1, ptr %Base2, i64 %N) {
 ; RECURSE-NEXT:          %fptr = getelementptr inbounds double, ptr %Base2, i64 %sel
 ; RECURSE-NEXT:      Grouped accesses:
 ; RECURSE-NEXT:        Group [[GRP58]]:
-; RECURSE-NEXT:          (Low: %Base1 High: (8 + %Base1))
+; RECURSE-NEXT:          (Low: %Base1 High: (8 + %Base1)<nuw>)
 ; RECURSE-NEXT:            Member: %Base1
 ; RECURSE-NEXT:        Group [[GRP59]]:
 ; RECURSE-NEXT:          (Low: %Base2 High: ((8 * %N) + %Base2))
@@ -1351,7 +1351,7 @@ define void @forked_ptrs_with_different_base(ptr nocapture readonly %Preds, ptr
 ; CHECK-NEXT:          (Low: %2 High: (63992 + %2))
 ; CHECK-NEXT:            Member: {%2,+,8}<nw><%for.body>
 ; CHECK-NEXT:        Group [[GRP61]]:
-; CHECK-NEXT:          (Low: %Preds High: (31996 + %Preds))
+; CHECK-NEXT:          (Low: %Preds High: (31996 + %Preds)<nuw>)
 ; CHECK-NEXT:            Me...
[truncated]

@efriedma-quic
Collaborator

When I've been thinking about flags in SCEV, I've been mostly considering modifying the flag representation itself: currently, flags have to be valid for the whole function, but we could add another form of flags that are only valid for users of a specific SCEV*. The primary problem with that is that SCEV canonicalization becomes less powerful: you can have two "equal" expressions that don't have the same flags. To work around that, you'd need some fast way to get from a SCEV with user-flags to the corresponding SCEV without those flags. Plus, you have two sets of flags, so reasoning about them becomes a bit harder. I think this is all solvable, but it's a pretty large change.

The alternative here is interesting: attach a scope to certain SCEVs, then you can modify their flags however you want, and nothing else ever sees those expressions.

It seems a bit dangerous to me to make the "scope" global per SCEV: scoped SCEVs could accidentally leak into contexts where you don't intend to use them (like the trip counts of loops). I guess the opposite is also sort of dangerous, though: if you accidentally drop the scope, you can accidentally introduce flags onto unscoped SCEVs.

I'm not sure Loop* pointers reliably live long enough to avoid accidentally reusing a scope; maybe consider just using a counter.
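One way to read the counter suggestion (a hypothetical sketch, not part
of this patch; the names are illustrative) would be to hand out an
opaque, monotonically increasing scope ID, so a freed and later
reallocated Loop can never collide with an earlier scope:

    // Hypothetical members of ScalarEvolution.
    uint64_t NextExprScopeID = 1;
    uint64_t ExprScopeID = 0; // 0 means no scope is active.

    uint64_t enterExprScope() {
      assert(ExprScopeID == 0 && "cannot overwrite existing expression scope");
      return ExprScopeID = NextExprScopeID++;
    }
    void clearExprScope() { ExprScopeID = 0; }

The uniquing key would then mix in the integer via
ID.AddInteger(ExprScopeID) rather than ID.AddPointer(ExprScope).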

@fhahn
Contributor Author

fhahn commented May 8, 2024

It seems a bit dangerous to me to make the "scope" global per SCEV: scoped SCEVs could accidentally leak into contexts where you don't intend to use them (like the trip counts of loops). I guess the opposite is also sort of dangerous, though: if you accidentally drop the scope, you can accidentally introduce flags onto unscoped SCEVs.

Yes, I think the main ones are computing new trip counts while a scope is set, and using setNoWrapFlags() to strengthen wrap flags on SCEVs that are also used while a scope is set. I'll see if we can add asserts to guard against those cases.

@nikic
Contributor

nikic commented May 9, 2024

The current handling of SCEV flags is a pretty general issue, and the approach used here will only be usable in limited circumstances -- ideally, we would want to be able to preserve flags when going from IR to SCEV, but I don't think that the scope-based approach here will enable that (or at least, not to a significant degree and not without complications).

The general approach I would consider is to replace our current const SCEV * pointers with SCEVUse, where the low bits are used to store no-wrap flags for a specific use of the expression.
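A minimal sketch of that tagged-pointer idea (an illustration only,
assuming llvm::PointerIntPair; the actual design was later proposed as
#91961, see below):

    #include "llvm/ADT/PointerIntPair.h"

    // Hypothetical: the low 2 bits of the pointer carry NUW/NSW flags that
    // are valid only for this particular use of the expression.
    class SCEVUse {
      llvm::PointerIntPair<const SCEV *, 2, unsigned> PtrAndFlags;

    public:
      SCEVUse(const SCEV *S, unsigned UseFlags = 0)
          : PtrAndFlags(S, UseFlags) {}
      // Converting back to const SCEV * drops the use-specific flags,
      // which is a safe default.
      operator const SCEV *() const { return PtrAndFlags.getPointer(); }
      unsigned getUseFlags() const { return PtrAndFlags.getInt(); }
    };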

@fhahn
Contributor Author

fhahn commented May 10, 2024

The current handling of SCEV flags is a pretty general issue, and the approach used here will only be usable in limited circumstances -- ideally, we would want to be able to preserve flags when going from IR to SCEV, but I don't think that the scope-based approach here will enable that (or at least, not to a significant degree and not without complications).

Fair point. I also spent a bit more time seeing whether it would be feasible to prevent leakage, but it proved quite tricky.

The general approach I would consider is to replace our current const SCEV * pointers with SCEVUse, where the low bits are used to store no-wrap flags for a specific use of the expression.

Yeah, that should also cover my use case here. I'll try to sketch a prototype.

@fhahn fhahn closed this May 10, 2024
@fhahn fhahn deleted the perf/scoped-scev-for-laa branch May 10, 2024 18:35
fhahn added a commit that referenced this pull request May 13, 2024
This patch introduces SCEVUse, which is a tagged pointer containing the
used const SCEV *, plus extra bits to store NUW/NSW flags that are only
valid at the specific use.

This was suggested by @nikic as an alternative
to #90742.

This patch just updates most SCEV infrastructure to operate on SCEVUse
instead of const SCEV *. It does not introduce any code that makes use
of the use-specific flags yet which I'll share as follow-ups.

Note that this should be NFC, but currently there's at least one case
where it is not (turn-to-invariant.ll), which I'll investigate once we
agree on the overall direction.

This PR at the moment also contains a commit that updates various SCEV
clients to use `const SCEV *` instead of `const auto *`, to prepare for
this patch. This reduces the number of changes needed, as SCEVUse will
automatically convert to `const SCEV *`. This is a safe default, as it
just drops the use-specific flags for the expression (it will not drop
any use-specific flags for any of its operands though).

SCEVUse could probably also be used to address mis-compiles where
equivalent AddRecs modulo flags result in an AddRec with incorrect flags
for some uses of some phis, e.g. the one
#80430 attempted to fix.

Compile-time impact:
stage1-O3: +0.06%
stage1-ReleaseThinLTO: +0.07%
stage1-ReleaseLTO-g: +0.07%
stage2-O3: +0.11%

https://llvm-compile-time-tracker.com/compare.php?from=ce055843e2be9643bd58764783a7bb69f6db8c9a&to=8c7f4e9e154ebc4862c4e2716cedc3c688352d7c&stat=instructions:u
@fhahn
Contributor Author

fhahn commented May 13, 2024

The general approach I would consider is to replace our current const SCEV * pointers with SCEVUse, where the low bits are used to store no-wrap flags for a specific use of the expression.

Yeah, that should also cover my use case here. I'll try to sketch a prototype.

Sketched in #91961 with 2 additional linked PRs to use SCEVUse. @nikic Is that along the lines of what you had in mind?

fhahn added a commit that referenced this pull request May 22, 2024
fhahn added a commit to fhahn/llvm-project that referenced this pull request May 22, 2024
fhahn added a commit that referenced this pull request May 31, 2024
fhahn added a commit to fhahn/llvm-project that referenced this pull request Jun 25, 2024
fhahn added a commit that referenced this pull request Aug 1, 2024
fhahn added a commit to fhahn/llvm-project that referenced this pull request Oct 4, 2024
fhahn added a commit that referenced this pull request Oct 9, 2024
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 25, 2024
fhahn added a commit to fhahn/llvm-project that referenced this pull request Dec 12, 2024
fhahn added a commit to fhahn/llvm-project that referenced this pull request Dec 13, 2024
fhahn added a commit to fhahn/llvm-project that referenced this pull request Feb 22, 2025
fhahn added a commit to fhahn/llvm-project that referenced this pull request Feb 23, 2025