Skip to content

[AST] Don't merge memory locations in AliasSetTracker #65731

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jan 17, 2024

Conversation

brunodf-snps
Copy link
Contributor

@brunodf-snps brunodf-snps commented Sep 8, 2023

This changes the AliasSetTracker to track memory locations instead of pointers in its alias sets. The motivation for this is outlined in an RFC posted on LLVM discourse.

This commit does not yet change method names and code comments to refer to "memory location(s)" instead of "pointer(s)", to keep the changeset focused, and to first gain feedback on the idea. An additional commit can take care of this.

In the data structures of the AST implementation, I made the choice to replace the linked list of PointerRec entries (that had to go anyway) with a simple flat vector of MemoryLocation objects, but for the AliasSet objects referenced from a lookup table, I retained the mechanism of a linked list, reference counting, forwarding, etc. The data structures could be revised in a follow-up change.

@brunodf-snps brunodf-snps requested review from a team as code owners September 8, 2023 09:23
@brunodf-snps brunodf-snps marked this pull request as draft September 8, 2023 09:25
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly it makes more sense to retrieve a MemoryAccess through a MemoryLocation rather than a pointer.

@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Sep 12, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 12, 2023

@llvm/pr-subscribers-llvm-transforms

Changes

This changes the AliasSetTracker to track memory locations instead of pointers in its alias sets. The motivation for this is outlined in an RFC posted on LLVM discourse.

This commit does not yet change method names and code comments to refer to "memory location(s)" instead of "pointer(s)", to keep the changeset focused, and to first gain feedback on the idea. An additional commit can take care of this.

In the data structures of the AST implementation, I made the choice to replace the linked list of PointerRec entries (that had to go anyway) with a simple flat vector of MemoryLocation objects, but for the AliasSet objects referenced from a lookup table, I retained the mechanism of a linked list, reference counting, forwarding, etc. The data structures could be revised in a follow-up change.

Patch is 114.78 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65731.diff

16 Files Affected:

  • (modified) llvm/include/llvm/Analysis/AliasSetTracker.h (+17-161)
  • (modified) llvm/include/llvm/LinkAllPasses.h (+1-2)
  • (modified) llvm/lib/Analysis/AliasSetTracker.cpp (+81-159)
  • (modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+5-5)
  • (modified) llvm/lib/Transforms/Scalar/LICM.cpp (+1-1)
  • (modified) llvm/lib/Transforms/Scalar/LoopVersioningLICM.cpp (+3-3)
  • (modified) llvm/test/Analysis/AliasSet/argmemonly.ll (+16-16)
  • (modified) llvm/test/Analysis/AliasSet/guards.ll (+108-108)
  • (modified) llvm/test/Analysis/AliasSet/intrinsics.ll (+5-5)
  • (modified) llvm/test/Analysis/AliasSet/memset.ll (+4-4)
  • (modified) llvm/test/Analysis/AliasSet/memtransfer.ll (+14-14)
  • (modified) llvm/test/Analysis/AliasSet/saturation.ll (+15-17)
  • (added) llvm/test/Transforms/LICM/variant-aainfo.ll (+65)
  • (modified) polly/lib/Analysis/ScopBuilder.cpp (+4-2)
  • (modified) polly/lib/Analysis/ScopDetection.cpp (+2-2)
  • (modified) polly/lib/Analysis/ScopDetectionDiagnostic.cpp (+2-1)
diff --git a/llvm/include/llvm/Analysis/AliasSetTracker.h b/llvm/include/llvm/Analysis/AliasSetTracker.h
index e485e1ff2f4c987..fe4e6e7e7acba91 100644
--- a/llvm/include/llvm/Analysis/AliasSetTracker.h
+++ b/llvm/include/llvm/Analysis/AliasSetTracker.h
@@ -49,99 +49,12 @@ class Value;
 class AliasSet : public ilist_node {
   friend class AliasSetTracker;
 
-  class PointerRec {
-    Value *Val;  // The pointer this record corresponds to.
-    PointerRec **PrevInList = nullptr;
-    PointerRec *NextInList = nullptr;
-    AliasSet *AS = nullptr;
-    LocationSize Size = LocationSize::mapEmpty();
-    AAMDNodes AAInfo;
-
-    // Whether the size for this record has been set at all. This makes no
-    // guarantees about the size being known.
-    bool isSizeSet() const { return Size != LocationSize::mapEmpty(); }
-
-  public:
-    PointerRec(Value *V)
-      : Val(V), AAInfo(DenseMapInfo::getEmptyKey()) {}
-
-    Value *getValue() const { return Val; }
-
-    PointerRec *getNext() const { return NextInList; }
-    bool hasAliasSet() const { return AS != nullptr; }
-
-    PointerRec** setPrevInList(PointerRec **PIL) {
-      PrevInList = PIL;
-      return &NextInList;
-    }
-
-    bool updateSizeAndAAInfo(LocationSize NewSize, const AAMDNodes &NewAAInfo) {
-      bool SizeChanged = false;
-      if (NewSize != Size) {
-        LocationSize OldSize = Size;
-        Size = isSizeSet() ? Size.unionWith(NewSize) : NewSize;
-        SizeChanged = OldSize != Size;
-      }
-
-      if (AAInfo == DenseMapInfo::getEmptyKey())
-        // We don't have a AAInfo yet. Set it to NewAAInfo.
-        AAInfo = NewAAInfo;
-      else {
-        AAMDNodes Intersection(AAInfo.intersect(NewAAInfo));
-        SizeChanged |= Intersection != AAInfo;
-        AAInfo = Intersection;
-      }
-      return SizeChanged;
-    }
-
-    LocationSize getSize() const {
-      assert(isSizeSet() && "Getting an unset size!");
-      return Size;
-    }
-
-    /// Return the AAInfo, or null if there is no information or conflicting
-    /// information.
-    AAMDNodes getAAInfo() const {
-      // If we have missing or conflicting AAInfo, return null.
-      if (AAInfo == DenseMapInfo::getEmptyKey() ||
-          AAInfo == DenseMapInfo::getTombstoneKey())
-        return AAMDNodes();
-      return AAInfo;
-    }
-
-    AliasSet *getAliasSet(AliasSetTracker &AST) {
-      assert(AS && "No AliasSet yet!");
-      if (AS->Forward) {
-        AliasSet *OldAS = AS;
-        AS = OldAS->getForwardedTarget(AST);
-        AS->addRef();
-        OldAS->dropRef(AST);
-      }
-      return AS;
-    }
-
-    void setAliasSet(AliasSet *as) {
-      assert(!AS && "Already have an alias set!");
-      AS = as;
-    }
-
-    void eraseFromList() {
-      if (NextInList) NextInList->PrevInList = PrevInList;
-      *PrevInList = NextInList;
-      if (AS->PtrListEnd == &NextInList) {
-        AS->PtrListEnd = PrevInList;
-        assert(*AS->PtrListEnd == nullptr && "List not terminated right!");
-      }
-      delete this;
-    }
-  };
-
-  // Doubly linked list of nodes.
-  PointerRec *PtrList = nullptr;
-  PointerRec **PtrListEnd;
   // Forwarding pointer.
   AliasSet *Forward = nullptr;
 
+  /// Memory locations in this alias set.
+  std::vector MemoryLocs;
+
   /// All instructions without a specific address in this alias set.
   std::vector> UnknownInsts;
 
@@ -178,8 +91,6 @@ class AliasSet : public ilist_node {
   };
   unsigned Alias : 1;
 
-  unsigned SetSize = 0;
-
   void addRef() { ++RefCount; }
 
   void dropRef(AliasSetTracker &AST) {
@@ -207,66 +118,21 @@ class AliasSet : public ilist_node {
 
   // Alias Set iteration - Allow access to all of the pointers which are part of
   // this alias set.
-  class iterator;
-  iterator begin() const { return iterator(PtrList); }
-  iterator end()   const { return iterator(); }
-  bool empty() const { return PtrList == nullptr; }
+  using iterator = std::vector::const_iterator;
+  iterator begin() const { return MemoryLocs.begin(); }
+  iterator end()   const { return MemoryLocs.end(); }
 
-  // Unfortunately, ilist::size() is linear, so we have to add code to keep
-  // track of the list's exact size.
-  unsigned size() { return SetSize; }
+  unsigned size() { return MemoryLocs.size(); }
 
   void print(raw_ostream &OS) const;
   void dump() const;
 
-  /// Define an iterator for alias sets... this is just a forward iterator.
-  class iterator {
-    PointerRec *CurNode;
-
-  public:
-    using iterator_category = std::forward_iterator_tag;
-    using value_type = PointerRec;
-    using difference_type = std::ptrdiff_t;
-    using pointer = value_type *;
-    using reference = value_type &;
-
-    explicit iterator(PointerRec *CN = nullptr) : CurNode(CN) {}
-
-    bool operator==(const iterator& x) const {
-      return CurNode == x.CurNode;
-    }
-    bool operator!=(const iterator& x) const { return !operator==(x); }
-
-    value_type &operator*() const {
-      assert(CurNode && "Dereferencing AliasSet.end()!");
-      return *CurNode;
-    }
-    value_type *operator->() const { return &operator*(); }
-
-    Value *getPointer() const { return CurNode->getValue(); }
-    LocationSize getSize() const { return CurNode->getSize(); }
-    AAMDNodes getAAInfo() const { return CurNode->getAAInfo(); }
-
-    iterator& operator++() {                // Preincrement
-      assert(CurNode && "Advancing past AliasSet.end()!");
-      CurNode = CurNode->getNext();
-      return *this;
-    }
-    iterator operator++(int) { // Postincrement
-      iterator tmp = *this; ++*this; return tmp;
-    }
-  };
-
 private:
   // Can only be created by AliasSetTracker.
   AliasSet()
-      : PtrListEnd(&PtrList), RefCount(0),  AliasAny(false), Access(NoAccess),
+      : RefCount(0), AliasAny(false), Access(NoAccess),
         Alias(SetMustAlias) {}
 
-  PointerRec *getSomePointer() const {
-    return PtrList;
-  }
-
   /// Return the real alias set this represents. If this has been merged with
   /// another set and is forwarding, return the ultimate destination set. This
   /// also implements the union-find collapsing as well.
@@ -284,16 +150,16 @@ class AliasSet : public ilist_node {
 
   void removeFromTracker(AliasSetTracker &AST);
 
-  void addPointer(AliasSetTracker &AST, PointerRec &Entry, LocationSize Size,
-                  const AAMDNodes &AAInfo, bool KnownMustAlias = false,
-                  bool SkipSizeUpdate = false);
+  bool isMustAliasMergeWith(AliasSet &AS, BatchAAResults &BatchAA) const;
+  void addPointer(AliasSetTracker &AST, const MemoryLocation &MemLoc,
+                  bool KnownMustAlias = false);
   void addUnknownInst(Instruction *I, BatchAAResults &AA);
 
 public:
   /// If the specified pointer "may" (or must) alias one of the members in the
   /// set return the appropriate AliasResult. Otherwise return NoAlias.
-  AliasResult aliasesPointer(const Value *Ptr, LocationSize Size,
-                             const AAMDNodes &AAInfo, BatchAAResults &AA) const;
+  AliasResult aliasesPointer(const MemoryLocation &MemLoc, BatchAAResults &AA) const;
+
   ModRefInfo aliasesUnknownInst(const Instruction *Inst,
                                 BatchAAResults &AA) const;
 };
@@ -307,7 +173,7 @@ class AliasSetTracker {
   BatchAAResults &AA;
   ilist AliasSets;
 
-  using PointerMapType = DenseMap, AliasSet::PointerRec *>;
+  using PointerMapType = DenseMap;
 
   // Map from pointers to their node
   PointerMapType PointerMap;
@@ -330,7 +196,7 @@ class AliasSetTracker {
   /// These methods return true if inserting the instruction resulted in the
   /// addition of a new alias set (i.e., the pointer did not alias anything).
   ///
-  void add(Value *Ptr, LocationSize Size, const AAMDNodes &AAInfo); // Add a loc
+  void add(const MemoryLocation &Loc); // Add a loc
   void add(LoadInst *LI);
   void add(StoreInst *SI);
   void add(VAArgInst *VAAI);
@@ -371,7 +237,7 @@ class AliasSetTracker {
   friend class AliasSet;
 
   // The total number of pointers contained in all "may" alias sets.
-  unsigned TotalMayAliasSetSize = 0;
+  unsigned TotalAliasSetSize = 0;
 
   // A non-null value signifies this AST is saturated. A saturated AST lumps
   // all pointers into a single "May" set.
@@ -379,18 +245,8 @@ class AliasSetTracker {
 
   void removeAliasSet(AliasSet *AS);
 
-  /// Just like operator[] on the map, except that it creates an entry for the
-  /// pointer if it doesn't already exist.
-  AliasSet::PointerRec &getEntryFor(Value *V) {
-    AliasSet::PointerRec *&Entry = PointerMap[V];
-    if (!Entry)
-      Entry = new AliasSet::PointerRec(V);
-    return *Entry;
-  }
-
   AliasSet &addPointer(MemoryLocation Loc, AliasSet::AccessLattice E);
-  AliasSet *mergeAliasSetsForPointer(const Value *Ptr, LocationSize Size,
-                                     const AAMDNodes &AAInfo,
+  AliasSet *mergeAliasSetsForPointer(const MemoryLocation& MemLoc,
                                      bool &MustAliasAll);
 
   /// Merge all alias sets into a single set that is considered to alias any
diff --git a/llvm/include/llvm/LinkAllPasses.h b/llvm/include/llvm/LinkAllPasses.h
index 081a15077e7be4c..a4ec0b337129787 100644
--- a/llvm/include/llvm/LinkAllPasses.h
+++ b/llvm/include/llvm/LinkAllPasses.h
@@ -163,8 +163,7 @@ namespace {
       llvm::AliasAnalysis AA(TLI);
       llvm::BatchAAResults BAA(AA);
       llvm::AliasSetTracker X(BAA);
-      X.add(nullptr, llvm::LocationSize::beforeOrAfterPointer(),
-            llvm::AAMDNodes()); // for -print-alias-sets
+      X.add(llvm::MemoryLocation()); // for -print-alias-sets
       (void) llvm::AreStatisticsEnabled();
       (void) llvm::sys::RunningOnValgrind();
     }
diff --git a/llvm/lib/Analysis/AliasSetTracker.cpp b/llvm/lib/Analysis/AliasSetTracker.cpp
index 91b889116dfa2d3..f5554b193694539 100644
--- a/llvm/lib/Analysis/AliasSetTracker.cpp
+++ b/llvm/lib/Analysis/AliasSetTracker.cpp
@@ -40,36 +40,37 @@ static cl::opt
                         cl::desc("The maximum number of pointers may-alias "
                                  "sets may contain before degradation"));
 
+bool AliasSet::isMustAliasMergeWith(AliasSet &AS, BatchAAResults &BatchAA) const {
+  for (const MemoryLocation &MemLoc : MemoryLocs)
+    for (const MemoryLocation &ASMemLoc : AS.MemoryLocs)
+      if (!BatchAA.isMustAlias(MemLoc, ASMemLoc))
+        return false;
+  return true;
+}
+
 /// mergeSetIn - Merge the specified alias set into this alias set.
 void AliasSet::mergeSetIn(AliasSet &AS, AliasSetTracker &AST,
                           BatchAAResults &BatchAA) {
   assert(!AS.Forward && "Alias set is already forwarding!");
   assert(!Forward && "This set is a forwarding set!!");
 
-  bool WasMustAlias = (Alias == SetMustAlias);
   // Update the alias and access types of this set...
   Access |= AS.Access;
   Alias  |= AS.Alias;
 
   if (Alias == SetMustAlias) {
-    // Check that these two merged sets really are must aliases.  Since both
-    // used to be must-alias sets, we can just check any pointer from each set
-    // for aliasing.
-    PointerRec *L = getSomePointer();
-    PointerRec *R = AS.getSomePointer();
-
+    // Check that these two merged sets really are must aliases.
     // If the pointers are not a must-alias pair, this set becomes a may alias.
-    if (!BatchAA.isMustAlias(
-            MemoryLocation(L->getValue(), L->getSize(), L->getAAInfo()),
-            MemoryLocation(R->getValue(), R->getSize(), R->getAAInfo())))
+    if (!isMustAliasMergeWith(AS, BatchAA))
       Alias = SetMayAlias;
   }
 
-  if (Alias == SetMayAlias) {
-    if (WasMustAlias)
-      AST.TotalMayAliasSetSize += size();
-    if (AS.Alias == SetMustAlias)
-      AST.TotalMayAliasSetSize += AS.size();
+  // Merge the list of constituent pointers...
+  if (MemoryLocs.empty()) {
+    std::swap(MemoryLocs, AS.MemoryLocs);
+  } else {
+    llvm::append_range(MemoryLocs, AS.MemoryLocs);
+    AS.MemoryLocs.clear();
   }
 
   bool ASHadUnknownInsts = !AS.UnknownInsts.empty();
@@ -86,18 +87,6 @@ void AliasSet::mergeSetIn(AliasSet &AS, AliasSetTracker &AST,
   AS.Forward = this; // Forward across AS now...
   addRef();          // AS is now pointing to us...
 
-  // Merge the list of constituent pointers...
-  if (AS.PtrList) {
-    SetSize += AS.size();
-    AS.SetSize = 0;
-    *PtrListEnd = AS.PtrList;
-    AS.PtrList->setPrevInList(PtrListEnd);
-    PtrListEnd = AS.PtrListEnd;
-
-    AS.PtrList = nullptr;
-    AS.PtrListEnd = &AS.PtrList;
-    assert(*AS.PtrListEnd == nullptr && "End of list is not null?");
-  }
   if (ASHadUnknownInsts)
     AS.dropRef(AST);
 }
@@ -106,9 +95,8 @@ void AliasSetTracker::removeAliasSet(AliasSet *AS) {
   if (AliasSet *Fwd = AS->Forward) {
     Fwd->dropRef(*this);
     AS->Forward = nullptr;
-  } else // Update TotalMayAliasSetSize only if not forwarding.
-      if (AS->Alias == AliasSet::SetMayAlias)
-        TotalMayAliasSetSize -= AS->size();
+  } else // Update TotalAliasSetSize only if not forwarding.
+      TotalAliasSetSize -= AS->size();
 
   AliasSets.erase(AS);
   // If we've removed the saturated alias set, set saturated marker back to
@@ -124,42 +112,20 @@ void AliasSet::removeFromTracker(AliasSetTracker &AST) {
   AST.removeAliasSet(this);
 }
 
-void AliasSet::addPointer(AliasSetTracker &AST, PointerRec &Entry,
-                          LocationSize Size, const AAMDNodes &AAInfo,
-                          bool KnownMustAlias, bool SkipSizeUpdate) {
-  assert(!Entry.hasAliasSet() && "Entry already in set!");
-
+void AliasSet::addPointer(AliasSetTracker &AST, const MemoryLocation& MemLoc,
+                          bool KnownMustAlias) {
   // Check to see if we have to downgrade to _may_ alias.
-  if (isMustAlias())
-    if (PointerRec *P = getSomePointer()) {
-      if (!KnownMustAlias) {
-        BatchAAResults &AA = AST.getAliasAnalysis();
-        AliasResult Result = AA.alias(
-            MemoryLocation(P->getValue(), P->getSize(), P->getAAInfo()),
-            MemoryLocation(Entry.getValue(), Size, AAInfo));
-        if (Result != AliasResult::MustAlias) {
-          Alias = SetMayAlias;
-          AST.TotalMayAliasSetSize += size();
-        }
-        assert(Result != AliasResult::NoAlias && "Cannot be part of must set!");
-      } else if (!SkipSizeUpdate)
-        P->updateSizeAndAAInfo(Size, AAInfo);
-    }
-
-  Entry.setAliasSet(this);
-  Entry.updateSizeAndAAInfo(Size, AAInfo);
+  if (isMustAlias() && !KnownMustAlias)
+    for (const MemoryLocation& ASMemLoc : MemoryLocs)
+      if (!AST.getAliasAnalysis().isMustAlias(ASMemLoc, MemLoc)) {
+        Alias = SetMayAlias;
+        break;
+      }
 
   // Add it to the end of the list...
-  ++SetSize;
-  assert(*PtrListEnd == nullptr && "End of list is not null?");
-  *PtrListEnd = &Entry;
-  PtrListEnd = Entry.setPrevInList(PtrListEnd);
-  assert(*PtrListEnd == nullptr && "End of list is not null?");
-  // Entry points to alias set.
-  addRef();
-
-  if (Alias == SetMayAlias)
-    AST.TotalMayAliasSetSize++;
+  MemoryLocs.push_back(MemLoc);
+
+  AST.TotalAliasSetSize++;
 }
 
 void AliasSet::addUnknownInst(Instruction *I, BatchAAResults &AA) {
@@ -187,41 +153,21 @@ void AliasSet::addUnknownInst(Instruction *I, BatchAAResults &AA) {
 /// members in the set return the appropriate AliasResult. Otherwise return
 /// NoAlias.
 ///
-AliasResult AliasSet::aliasesPointer(const Value *Ptr, LocationSize Size,
-                                     const AAMDNodes &AAInfo,
-                                     BatchAAResults &AA) const {
+AliasResult AliasSet::aliasesPointer(const MemoryLocation &MemLoc, BatchAAResults &AA) const {
   if (AliasAny)
     return AliasResult::MayAlias;
 
-  if (Alias == SetMustAlias) {
-    assert(UnknownInsts.empty() && "Illegal must alias set!");
-
-    // If this is a set of MustAliases, only check to see if the pointer aliases
-    // SOME value in the set.
-    PointerRec *SomePtr = getSomePointer();
-    assert(SomePtr && "Empty must-alias set??");
-    return AA.alias(MemoryLocation(SomePtr->getValue(), SomePtr->getSize(),
-                                   SomePtr->getAAInfo()),
-                    MemoryLocation(Ptr, Size, AAInfo));
-  }
-
-  // If this is a may-alias set, we have to check all of the pointers in the set
-  // to be sure it doesn't alias the set...
-  for (iterator I = begin(), E = end(); I != E; ++I) {
-    AliasResult AR =
-        AA.alias(MemoryLocation(Ptr, Size, AAInfo),
-                 MemoryLocation(I.getPointer(), I.getSize(), I.getAAInfo()));
+  // Check all of the pointers in the set...
+  for (const auto& ASMemLoc : MemoryLocs) {
+    AliasResult AR = AA.alias(MemLoc, ASMemLoc);
     if (AR != AliasResult::NoAlias)
       return AR;
   }
 
   // Check the unknown instructions...
-  if (!UnknownInsts.empty()) {
-    for (Instruction *Inst : UnknownInsts)
-      if (isModOrRefSet(
-              AA.getModRefInfo(Inst, MemoryLocation(Ptr, Size, AAInfo))))
-        return AliasResult::MayAlias;
-  }
+  for (Instruction *Inst : UnknownInsts)
+    if (isModOrRefSet(AA.getModRefInfo(Inst, MemLoc)))
+      return AliasResult::MayAlias;
 
   return AliasResult::NoAlias;
 }
@@ -246,9 +192,8 @@ ModRefInfo AliasSet::aliasesUnknownInst(const Instruction *Inst,
   }
 
   ModRefInfo MR = ModRefInfo::NoModRef;
-  for (iterator I = begin(), E = end(); I != E; ++I) {
-    MR |= AA.getModRefInfo(
-        Inst, MemoryLocation(I.getPointer(), I.getSize(), I.getAAInfo()));
+  for (const auto& ASMemLoc: MemoryLocs) {
+    MR |= AA.getModRefInfo(Inst, ASMemLoc);
     if (isModAndRefSet(MR))
       return MR;
   }
@@ -257,13 +202,7 @@ ModRefInfo AliasSet::aliasesUnknownInst(const Instruction *Inst,
 }
 
 void AliasSetTracker::clear() {
-  // Delete all the PointerRec entries.
-  for (auto &I : PointerMap)
-    I.second->eraseFromList();
-
   PointerMap.clear();
-
-  // The alias sets should all be clear now.
   AliasSets.clear();
 }
 
@@ -271,9 +210,7 @@ void AliasSetTracker::clear() {
 /// alias the pointer. Return the unified set, or nullptr if no set that aliases
 /// the pointer was found. MustAliasAll is updated to true/false if the pointer
 /// is found to MustAlias all the sets it merged.
-AliasSet *AliasSetTracker::mergeAliasSetsForPointer(const Value *Ptr,
-                                                    LocationSize Size,
-                                                    const AAMDNodes &AAInfo,
+AliasSet *AliasSetTracker::mergeAliasSetsForPointer(const MemoryLocation &MemLoc,
                                                     bool &MustAliasAll) {
   AliasSet *FoundSet = nullptr;
   MustAliasAll = true;
@@ -281,7 +218,7 @@ AliasSet *AliasSetTracker::mergeAliasSetsForPointer(const Value *Ptr,
     if (AS.Forward)
       continue;
 
-    AliasResult AR = AS.aliasesPointer(Ptr, Size, AAInfo, AA);
+    AliasResult AR = AS.aliasesPointer(MemLoc, AA);
     if (AR == AliasResult::NoAlias)
       continue;
 
@@ -317,59 +254,45 @@ AliasSet *AliasSetTracker::findAliasSetForUnknownInst(Instruction *Inst) {
 }
 
 AliasSet &AliasSetTracker::getAliasSetFor(const MemoryLocation &MemLoc) {
+  // Check if this MemLoc is already registered
+  AliasSet *&AS = PointerMap[MemLoc];
+  if (AS) {
+    if (AS->Forward) {
+      // ...

@llvmbot
Copy link
Member

llvmbot commented Sep 12, 2023

@llvm/pr-subscribers-llvm-analysis

Changes

This changes the AliasSetTracker to track memory locations instead of pointers in its alias sets. The motivation for this is outlined in an RFC posted on LLVM discourse.

This commit does not yet change method names and code comments to refer to "memory location(s)" instead of "pointer(s)", to keep the changeset focused, and to first gain feedback on the idea. An additional commit can take care of this.

In the data structures of the AST implementation, I made the choice to replace the linked list of PointerRec entries (that had to go anyway) with a simple flat vector of MemoryLocation objects, but for the AliasSet objects referenced from a lookup table, I retained the mechanism of a linked list, reference counting, forwarding, etc. The data structures could be revised in a follow-up change.

Patch is 114.78 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65731.diff

16 Files Affected:

  • (modified) llvm/include/llvm/Analysis/AliasSetTracker.h (+17-161)
  • (modified) llvm/include/llvm/LinkAllPasses.h (+1-2)
  • (modified) llvm/lib/Analysis/AliasSetTracker.cpp (+81-159)
  • (modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+5-5)
  • (modified) llvm/lib/Transforms/Scalar/LICM.cpp (+1-1)
  • (modified) llvm/lib/Transforms/Scalar/LoopVersioningLICM.cpp (+3-3)
  • (modified) llvm/test/Analysis/AliasSet/argmemonly.ll (+16-16)
  • (modified) llvm/test/Analysis/AliasSet/guards.ll (+108-108)
  • (modified) llvm/test/Analysis/AliasSet/intrinsics.ll (+5-5)
  • (modified) llvm/test/Analysis/AliasSet/memset.ll (+4-4)
  • (modified) llvm/test/Analysis/AliasSet/memtransfer.ll (+14-14)
  • (modified) llvm/test/Analysis/AliasSet/saturation.ll (+15-17)
  • (added) llvm/test/Transforms/LICM/variant-aainfo.ll (+65)
  • (modified) polly/lib/Analysis/ScopBuilder.cpp (+4-2)
  • (modified) polly/lib/Analysis/ScopDetection.cpp (+2-2)
  • (modified) polly/lib/Analysis/ScopDetectionDiagnostic.cpp (+2-1)
diff --git a/llvm/include/llvm/Analysis/AliasSetTracker.h b/llvm/include/llvm/Analysis/AliasSetTracker.h
index e485e1ff2f4c987..fe4e6e7e7acba91 100644
--- a/llvm/include/llvm/Analysis/AliasSetTracker.h
+++ b/llvm/include/llvm/Analysis/AliasSetTracker.h
@@ -49,99 +49,12 @@ class Value;
 class AliasSet : public ilist_node {
   friend class AliasSetTracker;
 
-  class PointerRec {
-    Value *Val;  // The pointer this record corresponds to.
-    PointerRec **PrevInList = nullptr;
-    PointerRec *NextInList = nullptr;
-    AliasSet *AS = nullptr;
-    LocationSize Size = LocationSize::mapEmpty();
-    AAMDNodes AAInfo;
-
-    // Whether the size for this record has been set at all. This makes no
-    // guarantees about the size being known.
-    bool isSizeSet() const { return Size != LocationSize::mapEmpty(); }
-
-  public:
-    PointerRec(Value *V)
-      : Val(V), AAInfo(DenseMapInfo::getEmptyKey()) {}
-
-    Value *getValue() const { return Val; }
-
-    PointerRec *getNext() const { return NextInList; }
-    bool hasAliasSet() const { return AS != nullptr; }
-
-    PointerRec** setPrevInList(PointerRec **PIL) {
-      PrevInList = PIL;
-      return &NextInList;
-    }
-
-    bool updateSizeAndAAInfo(LocationSize NewSize, const AAMDNodes &NewAAInfo) {
-      bool SizeChanged = false;
-      if (NewSize != Size) {
-        LocationSize OldSize = Size;
-        Size = isSizeSet() ? Size.unionWith(NewSize) : NewSize;
-        SizeChanged = OldSize != Size;
-      }
-
-      if (AAInfo == DenseMapInfo::getEmptyKey())
-        // We don't have a AAInfo yet. Set it to NewAAInfo.
-        AAInfo = NewAAInfo;
-      else {
-        AAMDNodes Intersection(AAInfo.intersect(NewAAInfo));
-        SizeChanged |= Intersection != AAInfo;
-        AAInfo = Intersection;
-      }
-      return SizeChanged;
-    }
-
-    LocationSize getSize() const {
-      assert(isSizeSet() && "Getting an unset size!");
-      return Size;
-    }
-
-    /// Return the AAInfo, or null if there is no information or conflicting
-    /// information.
-    AAMDNodes getAAInfo() const {
-      // If we have missing or conflicting AAInfo, return null.
-      if (AAInfo == DenseMapInfo::getEmptyKey() ||
-          AAInfo == DenseMapInfo::getTombstoneKey())
-        return AAMDNodes();
-      return AAInfo;
-    }
-
-    AliasSet *getAliasSet(AliasSetTracker &AST) {
-      assert(AS && "No AliasSet yet!");
-      if (AS->Forward) {
-        AliasSet *OldAS = AS;
-        AS = OldAS->getForwardedTarget(AST);
-        AS->addRef();
-        OldAS->dropRef(AST);
-      }
-      return AS;
-    }
-
-    void setAliasSet(AliasSet *as) {
-      assert(!AS && "Already have an alias set!");
-      AS = as;
-    }
-
-    void eraseFromList() {
-      if (NextInList) NextInList->PrevInList = PrevInList;
-      *PrevInList = NextInList;
-      if (AS->PtrListEnd == &NextInList) {
-        AS->PtrListEnd = PrevInList;
-        assert(*AS->PtrListEnd == nullptr && "List not terminated right!");
-      }
-      delete this;
-    }
-  };
-
-  // Doubly linked list of nodes.
-  PointerRec *PtrList = nullptr;
-  PointerRec **PtrListEnd;
   // Forwarding pointer.
   AliasSet *Forward = nullptr;
 
+  /// Memory locations in this alias set.
+  std::vector MemoryLocs;
+
   /// All instructions without a specific address in this alias set.
   std::vector> UnknownInsts;
 
@@ -178,8 +91,6 @@ class AliasSet : public ilist_node {
   };
   unsigned Alias : 1;
 
-  unsigned SetSize = 0;
-
   void addRef() { ++RefCount; }
 
   void dropRef(AliasSetTracker &AST) {
@@ -207,66 +118,21 @@ class AliasSet : public ilist_node {
 
   // Alias Set iteration - Allow access to all of the pointers which are part of
   // this alias set.
-  class iterator;
-  iterator begin() const { return iterator(PtrList); }
-  iterator end()   const { return iterator(); }
-  bool empty() const { return PtrList == nullptr; }
+  using iterator = std::vector::const_iterator;
+  iterator begin() const { return MemoryLocs.begin(); }
+  iterator end()   const { return MemoryLocs.end(); }
 
-  // Unfortunately, ilist::size() is linear, so we have to add code to keep
-  // track of the list's exact size.
-  unsigned size() { return SetSize; }
+  unsigned size() { return MemoryLocs.size(); }
 
   void print(raw_ostream &OS) const;
   void dump() const;
 
-  /// Define an iterator for alias sets... this is just a forward iterator.
-  class iterator {
-    PointerRec *CurNode;
-
-  public:
-    using iterator_category = std::forward_iterator_tag;
-    using value_type = PointerRec;
-    using difference_type = std::ptrdiff_t;
-    using pointer = value_type *;
-    using reference = value_type &;
-
-    explicit iterator(PointerRec *CN = nullptr) : CurNode(CN) {}
-
-    bool operator==(const iterator& x) const {
-      return CurNode == x.CurNode;
-    }
-    bool operator!=(const iterator& x) const { return !operator==(x); }
-
-    value_type &operator*() const {
-      assert(CurNode && "Dereferencing AliasSet.end()!");
-      return *CurNode;
-    }
-    value_type *operator->() const { return &operator*(); }
-
-    Value *getPointer() const { return CurNode->getValue(); }
-    LocationSize getSize() const { return CurNode->getSize(); }
-    AAMDNodes getAAInfo() const { return CurNode->getAAInfo(); }
-
-    iterator& operator++() {                // Preincrement
-      assert(CurNode && "Advancing past AliasSet.end()!");
-      CurNode = CurNode->getNext();
-      return *this;
-    }
-    iterator operator++(int) { // Postincrement
-      iterator tmp = *this; ++*this; return tmp;
-    }
-  };
-
 private:
   // Can only be created by AliasSetTracker.
   AliasSet()
-      : PtrListEnd(&PtrList), RefCount(0),  AliasAny(false), Access(NoAccess),
+      : RefCount(0), AliasAny(false), Access(NoAccess),
         Alias(SetMustAlias) {}
 
-  PointerRec *getSomePointer() const {
-    return PtrList;
-  }
-
   /// Return the real alias set this represents. If this has been merged with
   /// another set and is forwarding, return the ultimate destination set. This
   /// also implements the union-find collapsing as well.
@@ -284,16 +150,16 @@ class AliasSet : public ilist_node {
 
   void removeFromTracker(AliasSetTracker &AST);
 
-  void addPointer(AliasSetTracker &AST, PointerRec &Entry, LocationSize Size,
-                  const AAMDNodes &AAInfo, bool KnownMustAlias = false,
-                  bool SkipSizeUpdate = false);
+  bool isMustAliasMergeWith(AliasSet &AS, BatchAAResults &BatchAA) const;
+  void addPointer(AliasSetTracker &AST, const MemoryLocation &MemLoc,
+                  bool KnownMustAlias = false);
   void addUnknownInst(Instruction *I, BatchAAResults &AA);
 
 public:
   /// If the specified pointer "may" (or must) alias one of the members in the
   /// set return the appropriate AliasResult. Otherwise return NoAlias.
-  AliasResult aliasesPointer(const Value *Ptr, LocationSize Size,
-                             const AAMDNodes &AAInfo, BatchAAResults &AA) const;
+  AliasResult aliasesPointer(const MemoryLocation &MemLoc, BatchAAResults &AA) const;
+
   ModRefInfo aliasesUnknownInst(const Instruction *Inst,
                                 BatchAAResults &AA) const;
 };
@@ -307,7 +173,7 @@ class AliasSetTracker {
   BatchAAResults &AA;
   ilist AliasSets;
 
-  using PointerMapType = DenseMap, AliasSet::PointerRec *>;
+  using PointerMapType = DenseMap;
 
   // Map from pointers to their node
   PointerMapType PointerMap;
@@ -330,7 +196,7 @@ class AliasSetTracker {
   /// These methods return true if inserting the instruction resulted in the
   /// addition of a new alias set (i.e., the pointer did not alias anything).
   ///
-  void add(Value *Ptr, LocationSize Size, const AAMDNodes &AAInfo); // Add a loc
+  void add(const MemoryLocation &Loc); // Add a loc
   void add(LoadInst *LI);
   void add(StoreInst *SI);
   void add(VAArgInst *VAAI);
@@ -371,7 +237,7 @@ class AliasSetTracker {
   friend class AliasSet;
 
   // The total number of pointers contained in all "may" alias sets.
-  unsigned TotalMayAliasSetSize = 0;
+  unsigned TotalAliasSetSize = 0;
 
   // A non-null value signifies this AST is saturated. A saturated AST lumps
   // all pointers into a single "May" set.
@@ -379,18 +245,8 @@ class AliasSetTracker {
 
   void removeAliasSet(AliasSet *AS);
 
-  /// Just like operator[] on the map, except that it creates an entry for the
-  /// pointer if it doesn't already exist.
-  AliasSet::PointerRec &getEntryFor(Value *V) {
-    AliasSet::PointerRec *&Entry = PointerMap[V];
-    if (!Entry)
-      Entry = new AliasSet::PointerRec(V);
-    return *Entry;
-  }
-
   AliasSet &addPointer(MemoryLocation Loc, AliasSet::AccessLattice E);
-  AliasSet *mergeAliasSetsForPointer(const Value *Ptr, LocationSize Size,
-                                     const AAMDNodes &AAInfo,
+  AliasSet *mergeAliasSetsForPointer(const MemoryLocation& MemLoc,
                                      bool &MustAliasAll);
 
   /// Merge all alias sets into a single set that is considered to alias any
diff --git a/llvm/include/llvm/LinkAllPasses.h b/llvm/include/llvm/LinkAllPasses.h
index 081a15077e7be4c..a4ec0b337129787 100644
--- a/llvm/include/llvm/LinkAllPasses.h
+++ b/llvm/include/llvm/LinkAllPasses.h
@@ -163,8 +163,7 @@ namespace {
       llvm::AliasAnalysis AA(TLI);
       llvm::BatchAAResults BAA(AA);
       llvm::AliasSetTracker X(BAA);
-      X.add(nullptr, llvm::LocationSize::beforeOrAfterPointer(),
-            llvm::AAMDNodes()); // for -print-alias-sets
+      X.add(llvm::MemoryLocation()); // for -print-alias-sets
       (void) llvm::AreStatisticsEnabled();
       (void) llvm::sys::RunningOnValgrind();
     }
diff --git a/llvm/lib/Analysis/AliasSetTracker.cpp b/llvm/lib/Analysis/AliasSetTracker.cpp
index 91b889116dfa2d3..f5554b193694539 100644
--- a/llvm/lib/Analysis/AliasSetTracker.cpp
+++ b/llvm/lib/Analysis/AliasSetTracker.cpp
@@ -40,36 +40,37 @@ static cl::opt
                         cl::desc("The maximum number of pointers may-alias "
                                  "sets may contain before degradation"));
 
+bool AliasSet::isMustAliasMergeWith(AliasSet &AS, BatchAAResults &BatchAA) const {
+  for (const MemoryLocation &MemLoc : MemoryLocs)
+    for (const MemoryLocation &ASMemLoc : AS.MemoryLocs)
+      if (!BatchAA.isMustAlias(MemLoc, ASMemLoc))
+        return false;
+  return true;
+}
+
 /// mergeSetIn - Merge the specified alias set into this alias set.
 void AliasSet::mergeSetIn(AliasSet &AS, AliasSetTracker &AST,
                           BatchAAResults &BatchAA) {
   assert(!AS.Forward && "Alias set is already forwarding!");
   assert(!Forward && "This set is a forwarding set!!");
 
-  bool WasMustAlias = (Alias == SetMustAlias);
   // Update the alias and access types of this set...
   Access |= AS.Access;
   Alias  |= AS.Alias;
 
   if (Alias == SetMustAlias) {
-    // Check that these two merged sets really are must aliases.  Since both
-    // used to be must-alias sets, we can just check any pointer from each set
-    // for aliasing.
-    PointerRec *L = getSomePointer();
-    PointerRec *R = AS.getSomePointer();
-
+    // Check that these two merged sets really are must aliases.
     // If the pointers are not a must-alias pair, this set becomes a may alias.
-    if (!BatchAA.isMustAlias(
-            MemoryLocation(L->getValue(), L->getSize(), L->getAAInfo()),
-            MemoryLocation(R->getValue(), R->getSize(), R->getAAInfo())))
+    if (!isMustAliasMergeWith(AS, BatchAA))
       Alias = SetMayAlias;
   }
 
-  if (Alias == SetMayAlias) {
-    if (WasMustAlias)
-      AST.TotalMayAliasSetSize += size();
-    if (AS.Alias == SetMustAlias)
-      AST.TotalMayAliasSetSize += AS.size();
+  // Merge the list of constituent pointers...
+  if (MemoryLocs.empty()) {
+    std::swap(MemoryLocs, AS.MemoryLocs);
+  } else {
+    llvm::append_range(MemoryLocs, AS.MemoryLocs);
+    AS.MemoryLocs.clear();
   }
 
   bool ASHadUnknownInsts = !AS.UnknownInsts.empty();
@@ -86,18 +87,6 @@ void AliasSet::mergeSetIn(AliasSet &AS, AliasSetTracker &AST,
   AS.Forward = this; // Forward across AS now...
   addRef();          // AS is now pointing to us...
 
-  // Merge the list of constituent pointers...
-  if (AS.PtrList) {
-    SetSize += AS.size();
-    AS.SetSize = 0;
-    *PtrListEnd = AS.PtrList;
-    AS.PtrList->setPrevInList(PtrListEnd);
-    PtrListEnd = AS.PtrListEnd;
-
-    AS.PtrList = nullptr;
-    AS.PtrListEnd = &AS.PtrList;
-    assert(*AS.PtrListEnd == nullptr && "End of list is not null?");
-  }
   if (ASHadUnknownInsts)
     AS.dropRef(AST);
 }
@@ -106,9 +95,8 @@ void AliasSetTracker::removeAliasSet(AliasSet *AS) {
   if (AliasSet *Fwd = AS->Forward) {
     Fwd->dropRef(*this);
     AS->Forward = nullptr;
-  } else // Update TotalMayAliasSetSize only if not forwarding.
-      if (AS->Alias == AliasSet::SetMayAlias)
-        TotalMayAliasSetSize -= AS->size();
+  } else // Update TotalAliasSetSize only if not forwarding.
+      TotalAliasSetSize -= AS->size();
 
   AliasSets.erase(AS);
   // If we've removed the saturated alias set, set saturated marker back to
@@ -124,42 +112,20 @@ void AliasSet::removeFromTracker(AliasSetTracker &AST) {
   AST.removeAliasSet(this);
 }
 
-void AliasSet::addPointer(AliasSetTracker &AST, PointerRec &Entry,
-                          LocationSize Size, const AAMDNodes &AAInfo,
-                          bool KnownMustAlias, bool SkipSizeUpdate) {
-  assert(!Entry.hasAliasSet() && "Entry already in set!");
-
+void AliasSet::addPointer(AliasSetTracker &AST, const MemoryLocation& MemLoc,
+                          bool KnownMustAlias) {
   // Check to see if we have to downgrade to _may_ alias.
-  if (isMustAlias())
-    if (PointerRec *P = getSomePointer()) {
-      if (!KnownMustAlias) {
-        BatchAAResults &AA = AST.getAliasAnalysis();
-        AliasResult Result = AA.alias(
-            MemoryLocation(P->getValue(), P->getSize(), P->getAAInfo()),
-            MemoryLocation(Entry.getValue(), Size, AAInfo));
-        if (Result != AliasResult::MustAlias) {
-          Alias = SetMayAlias;
-          AST.TotalMayAliasSetSize += size();
-        }
-        assert(Result != AliasResult::NoAlias && "Cannot be part of must set!");
-      } else if (!SkipSizeUpdate)
-        P->updateSizeAndAAInfo(Size, AAInfo);
-    }
-
-  Entry.setAliasSet(this);
-  Entry.updateSizeAndAAInfo(Size, AAInfo);
+  if (isMustAlias() && !KnownMustAlias)
+    for (const MemoryLocation& ASMemLoc : MemoryLocs)
+      if (!AST.getAliasAnalysis().isMustAlias(ASMemLoc, MemLoc)) {
+        Alias = SetMayAlias;
+        break;
+      }
 
   // Add it to the end of the list...
-  ++SetSize;
-  assert(*PtrListEnd == nullptr && "End of list is not null?");
-  *PtrListEnd = &Entry;
-  PtrListEnd = Entry.setPrevInList(PtrListEnd);
-  assert(*PtrListEnd == nullptr && "End of list is not null?");
-  // Entry points to alias set.
-  addRef();
-
-  if (Alias == SetMayAlias)
-    AST.TotalMayAliasSetSize++;
+  MemoryLocs.push_back(MemLoc);
+
+  AST.TotalAliasSetSize++;
 }
 
 void AliasSet::addUnknownInst(Instruction *I, BatchAAResults &AA) {
@@ -187,41 +153,21 @@ void AliasSet::addUnknownInst(Instruction *I, BatchAAResults &AA) {
 /// members in the set return the appropriate AliasResult. Otherwise return
 /// NoAlias.
 ///
-AliasResult AliasSet::aliasesPointer(const Value *Ptr, LocationSize Size,
-                                     const AAMDNodes &AAInfo,
-                                     BatchAAResults &AA) const {
+AliasResult AliasSet::aliasesPointer(const MemoryLocation &MemLoc, BatchAAResults &AA) const {
   if (AliasAny)
     return AliasResult::MayAlias;
 
-  if (Alias == SetMustAlias) {
-    assert(UnknownInsts.empty() && "Illegal must alias set!");
-
-    // If this is a set of MustAliases, only check to see if the pointer aliases
-    // SOME value in the set.
-    PointerRec *SomePtr = getSomePointer();
-    assert(SomePtr && "Empty must-alias set??");
-    return AA.alias(MemoryLocation(SomePtr->getValue(), SomePtr->getSize(),
-                                   SomePtr->getAAInfo()),
-                    MemoryLocation(Ptr, Size, AAInfo));
-  }
-
-  // If this is a may-alias set, we have to check all of the pointers in the set
-  // to be sure it doesn't alias the set...
-  for (iterator I = begin(), E = end(); I != E; ++I) {
-    AliasResult AR =
-        AA.alias(MemoryLocation(Ptr, Size, AAInfo),
-                 MemoryLocation(I.getPointer(), I.getSize(), I.getAAInfo()));
+  // Check all of the pointers in the set...
+  for (const auto& ASMemLoc : MemoryLocs) {
+    AliasResult AR = AA.alias(MemLoc, ASMemLoc);
     if (AR != AliasResult::NoAlias)
       return AR;
   }
 
   // Check the unknown instructions...
-  if (!UnknownInsts.empty()) {
-    for (Instruction *Inst : UnknownInsts)
-      if (isModOrRefSet(
-              AA.getModRefInfo(Inst, MemoryLocation(Ptr, Size, AAInfo))))
-        return AliasResult::MayAlias;
-  }
+  for (Instruction *Inst : UnknownInsts)
+    if (isModOrRefSet(AA.getModRefInfo(Inst, MemLoc)))
+      return AliasResult::MayAlias;
 
   return AliasResult::NoAlias;
 }
@@ -246,9 +192,8 @@ ModRefInfo AliasSet::aliasesUnknownInst(const Instruction *Inst,
   }
 
   ModRefInfo MR = ModRefInfo::NoModRef;
-  for (iterator I = begin(), E = end(); I != E; ++I) {
-    MR |= AA.getModRefInfo(
-        Inst, MemoryLocation(I.getPointer(), I.getSize(), I.getAAInfo()));
+  for (const auto& ASMemLoc: MemoryLocs) {
+    MR |= AA.getModRefInfo(Inst, ASMemLoc);
     if (isModAndRefSet(MR))
       return MR;
   }
@@ -257,13 +202,7 @@ ModRefInfo AliasSet::aliasesUnknownInst(const Instruction *Inst,
 }
 
 void AliasSetTracker::clear() {
-  // Delete all the PointerRec entries.
-  for (auto &I : PointerMap)
-    I.second->eraseFromList();
-
   PointerMap.clear();
-
-  // The alias sets should all be clear now.
   AliasSets.clear();
 }
 
@@ -271,9 +210,7 @@ void AliasSetTracker::clear() {
 /// alias the pointer. Return the unified set, or nullptr if no set that aliases
 /// the pointer was found. MustAliasAll is updated to true/false if the pointer
 /// is found to MustAlias all the sets it merged.
-AliasSet *AliasSetTracker::mergeAliasSetsForPointer(const Value *Ptr,
-                                                    LocationSize Size,
-                                                    const AAMDNodes &AAInfo,
+AliasSet *AliasSetTracker::mergeAliasSetsForPointer(const MemoryLocation &MemLoc,
                                                     bool &MustAliasAll) {
   AliasSet *FoundSet = nullptr;
   MustAliasAll = true;
@@ -281,7 +218,7 @@ AliasSet *AliasSetTracker::mergeAliasSetsForPointer(const Value *Ptr,
     if (AS.Forward)
       continue;
 
-    AliasResult AR = AS.aliasesPointer(Ptr, Size, AAInfo, AA);
+    AliasResult AR = AS.aliasesPointer(MemLoc, AA);
     if (AR == AliasResult::NoAlias)
       continue;
 
@@ -317,59 +254,45 @@ AliasSet *AliasSetTracker::findAliasSetForUnknownInst(Instruction *Inst) {
 }
 
 AliasSet &AliasSetTracker::getAliasSetFor(const MemoryLocation &MemLoc) {
+  // Check if this MemLoc is already registered
+  AliasSet *&AS = PointerMap[MemLoc];
+  if (AS) {
+    if (AS->Forward) {
+      // ...

@brunodf-snps
Copy link
Contributor Author

Tagging some possible reviewers:
r? @nikic
r? @jdoerfert
r? @efriedma-quic

Copy link
Member

@jdoerfert jdoerfert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left comments. Generally, I think this is the right way to go. We had the custom rolled stuff that was expanded from MemLoc, and used to create MemLoc all the time anyway.

Do we have compile time data? Maybe like this and w/ a SetVector instead of a plain vector?

void addPointer(AliasSetTracker &AST, PointerRec &Entry, LocationSize Size,
const AAMDNodes &AAInfo, bool KnownMustAlias = false,
bool SkipSizeUpdate = false);
bool isMustAliasMergeWith(AliasSet &AS, BatchAAResults &BatchAA) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function needs documentation. isMustAlias and MergeWith? I'm confused.

Copy link
Contributor Author

@brunodf-snps brunodf-snps Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation was added in commit e1c2fbf. Actually, the auxiliary function was only introduced to exit elegantly from a nested loop. I still removed the function from the class, and made it internal to the AliasSetTracker.cpp file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the auxiliary function was only introduced to exit elegantly from a nested loop.

This was perhaps unnecessarily heavy. I still removed the function, and replaced it by an immediately-invoked lambda, so the logic is directly visible in AliasSet::mergeSetIn.

bool empty() const { return PtrList == nullptr; }
using iterator = std::vector<MemoryLocation>::const_iterator;
iterator begin() const { return MemoryLocs.begin(); }
iterator end() const { return MemoryLocs.end(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it reduce the casts for the user if we had non-const iterators?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source of the const-casts is not the const-iterator, it is inside MemoryLocation.

I think it would be possible to provide a convenience version of getPointers that integrates the const-cast so the client would not be burdened with this, but the caller would still be responsible to ensure the cast is valid. Is there a precedent of this in LLVM? I'm only aware of methods that intregrate const-casts when we know it is valid (example).

@@ -307,7 +173,7 @@ class AliasSetTracker {
BatchAAResults &AA;
ilist<AliasSet> AliasSets;

using PointerMapType = DenseMap<AssertingVH<Value>, AliasSet::PointerRec *>;
using PointerMapType = DenseMap<MemoryLocation, AliasSet *>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A MemoryLocation is not super small, I think 7 or 8 pointer sized objects, roughly. Unsure if this could cause issues at some point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really get why this needs to be indexed by MemoryLocation. I'd still expect all MemoryLocations for the same Value to be in the same AliasSet, so it seems like the existing indexing should work fine...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still expect all MemoryLocations for the same Value to be in the same AliasSet

I was unsure if this always holds. In case of full restrict support, I think the same pointer value is used for the two stores in the following snippet:

int * restrict r = p;
*p = 123;
*r = 456;

But they have different provenance and metadata, and ScopedNoAliasAA will report that they do not alias. But this snippet has undefined behavior (because of course, the accesses do alias).

I guess that BasicAA will quickly return MustAlias for a query with the same pointer Value regardless of LocationSize, metadata, etc.? Then indeed another AA can only say NoAlias in an illegal situation.

Copy link
Contributor Author

@brunodf-snps brunodf-snps Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really get why this needs to be indexed by MemoryLocation [...] it seems like the existing indexing should work fine...

I still tried this, see this branch. The pointer map can be smaller, but this is traded for extra work in AliasSetTracker::getAliasSetFor, see commit 23443f3. It turns out that two memory locations for the same UndefValue do end up in different alias sets (see this BasicAA logic) so I have to exclude them from the pointer map, which complicates the logic in AliasSetTracker::getAliasSetFor quite a bit. On the plus side, a lots of tests can be reverted, see commit 92f01a2, since we can restore the "pointer values" output in the alias set dump, but I don't know if that is much of an argument.

Let me know if you consider the version on the branch an improvement, then I will include it in this pull request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still realized there was a problem with commit 23443f3. When a memory location with an undef pointer value is added to an alias set, that alias set is not referenced from the pointer map entry of that pointer value (because the pointer value may appear in multiple alias sets). But without the reference to the alias set from some access structure, the reference count of the set may reach zero, and the alias set may be dropped. (For similar reason, the reference count of an alias set is increased by 1 when at least one unknown instruction is registered in it.)

I solved this problem (on the aforementioned branch) by keeping a collection with alias sets that contain undef pointer values. This keeps those alias sets referenced, so they are not dropped. It also allows to scan only those alias sets, when we are trying to find if a memory location with an undef pointer value is already registered or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Yes, AA treats undef as noalias, but as far as I know AST specifically wants the undefs to be part of the same set, or at least that is how I interpret the comment at

// If the size changed, we may need to merge several alias sets.
// Note that we can *not* return the result of mergeAliasSetsForPointer
// due to a quirk of alias analysis behavior. Since alias(undef, undef)
// is NoAlias, mergeAliasSetsForPointer(undef, ...) will not find the
// the right set for undef, even if it exists.
. It seems like you are going out of your way to have undefs in different sets -- why?

More generally I would expect that all MemLocs that have the same pointer will end up in the same alias set, even if AA reports them as NoAlias (as implied by metadata or undef). After all they are also MustAlias, and treating them as such makes things a lot simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that when using a map from pointer values, it is more logical to simply place MemLocs in an existing set of the pointer value, even if AA reports NoAlias for this set. I've updated the ast-pointer-map branch to this effect, in particular, see:
https://github.com/brunodf-snps/llvm-project/blob/9e9ebcd53aff011cff2b4d7ebdf439855806894b/llvm/lib/Analysis/AliasSetTracker.cpp#L312-L319

But be aware that in a program where AA indicates that two instruction using an undef pointer are not aliasing, putting them in the same alias set is information loss (although it may not be relevant):

%val = load i32, ptr undef
...
store ptr null, ptr undef

So I would disagree with the cited code comment, or at least find it misleading:

Since alias(undef, undef) is NoAlias, mergeAliasSetsForPointer(undef, ...) will not find the the right set for undef, even if it exists.

mergeAliasSetsForPointer is correctly indicating (with a nullptr result) that there is no set that the second memory location aliases with, according to AA. The existing set for the first memory location with an undef pointer is only a "right" set insofar that the implementation with a pointer map cannot accurately represent the situation. (But it may be a fine tradeoff to be inaccurate here.)

It seems like you are going out of your way to have undefs in different sets -- why?

Well, I have roughly 2 objectives for AST with this patch: (1) avoid information loss and (2) minimal knowledge about AA or MemoryLocation internals. So I am reluctant to add information loss for an implementation organization (the pointer map). But you seem to favor the pointer map for conceptual reasons too, so then I would propose what is on my ast-pointer-map branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Information loss is unavoidable here, because both NoAlias and MustAlias are correct results for two undef pointers. You're always going to lose one or the other. We can chose which one we pick.

The current AST implementation picks MustAlias, and I don't see a reason why we would want to change that. Creating a new set for every undef pointer doesn't make sense to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is true that both NoAlias and MustAlias are correct results here and you can choose which one to pick, but my reasoning was that AA already picks NoAlias and so I attempted to represent the same thing in AST (when I say I strive for no information loss, I essentially mean that AST should construct alias sets consistent with AA results -- the motivating example is a case where AA is indicating NoAlias, and current AST is ending up with MayAlias due to implementation organization).

But I am fine with an implementation where all MemoryLocations for the same pointer Value are by construction in the same AliasSet, i.e. where the PointerMap stays. The discussion has shown that the impact is only for undef pointer values, and I expect the difference is of very little practical relevance (as you basically write, creating a new set for every undef pointer does not seem to have added value). So I have merged my ast-pointer-map branch into this PR now, so the combined set of changes that I propose is visible here, and we can move forward with that.

bool AliasSet::isMustAliasMergeWith(AliasSet &AS, BatchAAResults &BatchAA) const {
for (const MemoryLocation &MemLoc : MemoryLocs)
for (const MemoryLocation &ASMemLoc : AS.MemoryLocs)
if (!BatchAA.isMustAlias(MemLoc, ASMemLoc))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must-alias is not a transitive relationship? Before we picked a representative and only checked those, why do we need to check all entries now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is discussed in the RFC, using a representative also implies merging of the LocationSize, AAInfo,... of the other accesses into it: https://discourse.llvm.org/t/rfc-dont-merge-memory-locations-in-aliassettracker/73336#mustalias-sets-saturation-4

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though a representative is not possible, @nikic convinced me that any must-alias pair is sufficient because must-alias is transitive. See commit 5f82129.

@@ -551,7 +472,7 @@ AliasSet &AliasSetTracker::addPointer(MemoryLocation Loc,
AliasSet &AS = getAliasSetFor(Loc);
AS.Access |= E;

if (!AliasAnyAS && (TotalMayAliasSetSize > SaturationThreshold)) {
if (!AliasAnyAS && (TotalAliasSetSize > SaturationThreshold)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear if we should increase the threshold now. I guess we can wait and see.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do no action at the moment.

@brunodf-snps
Copy link
Contributor Author

@alinas
Copy link
Contributor

alinas commented Oct 17, 2023

The proposal SGTM!
I was somewhat surprised this doesn't affect compile time at all, with it doing AA checks on all MemoryLocation pairs, but this seems to indeed not have any impact (tested on internal benchmarks too). My assumption is that most alias checks are taking advantage of the cache in BatchAA.

I'll follow up with inline comments.
+1 to update method names to match the change to tracking memory locations, and update comments.

nikic added a commit to nikic/llvm-project that referenced this pull request Nov 29, 2023
AST checks aliasing with MustAlias sets by only checking the
representative pointer (getSomePointer). This is only correct
if the Size and AATags information of that pointer also includes
the Size/AATags of all other pointers in the set.

When we add a new pointer to the AliasSet, we do perform this
update (see the code in AliasSet::addPointer). However, if a
pointer already in the MustAlias set is used with a new size,
we currently do not update the representative pointer, resulting
in miscompilations. Fix this by adding the missing update.

This is targeted fix using the current representation. There are
a couple of alternatives:
 * For MustAlias sets, don't store per-pointer Size/AATags at all.
   This would make it clear that there is only one set of common
   Size/AATags for all pointers.
 * Check against all pointers in the set even for MustAlias. This
   is what llvm#65731
   proposes to do as part of a larger change to AST representation.

Fixes llvm#64897.
nikic added a commit that referenced this pull request Dec 7, 2023
AST checks aliasing with MustAlias sets by only checking the
representative pointer (getSomePointer). This is only correct if the
Size and AATags information of that pointer also includes the
Size/AATags of all other pointers in the set.

When we add a new pointer to the AliasSet, we do perform this update
(see the code in AliasSet::addPointer). However, if a pointer already in
the MustAlias set is used with a new size, we currently do not update
the representative pointer, resulting in miscompilations. Fix this by
adding the missing update.

This is a targeted fix using the current representation. There are a
couple of alternatives:
* For MustAlias sets, don't store per-pointer Size/AATags at all. This
would make it clear that there is only one set of common Size/AATags for
all pointers.
* Check against all pointers in the set even for MustAlias. This is what
#65731 proposes to do as part
of a larger change to AST representation.

Fixes #64897.
@brunodf-snps brunodf-snps force-pushed the ast-dont-merge-memory-locations branch from 54371b9 to 7caf0bc Compare December 12, 2023 13:55
Copy link

github-actions bot commented Dec 12, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@brunodf-snps brunodf-snps force-pushed the ast-dont-merge-memory-locations branch from 93d4b9e to 9304962 Compare December 12, 2023 14:31
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just some comments on the test, before it can be pre-committed.)

@brunodf-snps brunodf-snps force-pushed the ast-dont-merge-memory-locations branch 2 times, most recently from 54a97d1 to 2af6cb3 Compare December 12, 2023 22:57
@nikic
Copy link
Contributor

nikic commented Dec 13, 2023

I have landed 358e765, c030778 and 3c9236c. Note that I slightly modified the second commit to drop the extra wrapper method.

@brunodf-snps brunodf-snps force-pushed the ast-dont-merge-memory-locations branch from 2af6cb3 to 56a6a53 Compare December 13, 2023 23:32
@brunodf-snps
Copy link
Contributor Author

I have landed 358e765, c030778 and 3c9236c. Note that I slightly modified the second commit to drop the extra wrapper method.

Thanks! I have rebased again to make the precommitted changes disappear from this PR.

This allows to scan only these sets when we want to check if a memory
locations was already registered, but importantly, it also keeps these
alias sets referenced.
@brunodf-snps
Copy link
Contributor Author

Apart from updating the method names and comments (which is the reason why I still keep the PR marked as draft), I think I have now addressed all of the comments up to this point.

@@ -307,7 +173,7 @@ class AliasSetTracker {
BatchAAResults &AA;
ilist<AliasSet> AliasSets;

using PointerMapType = DenseMap<AssertingVH<Value>, AliasSet::PointerRec *>;
using PointerMapType = DenseMap<MemoryLocation, AliasSet *>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Yes, AA treats undef as noalias, but as far as I know AST specifically wants the undefs to be part of the same set, or at least that is how I interpret the comment at

// If the size changed, we may need to merge several alias sets.
// Note that we can *not* return the result of mergeAliasSetsForPointer
// due to a quirk of alias analysis behavior. Since alias(undef, undef)
// is NoAlias, mergeAliasSetsForPointer(undef, ...) will not find the
// the right set for undef, even if it exists.
. It seems like you are going out of your way to have undefs in different sets -- why?

More generally I would expect that all MemLocs that have the same pointer will end up in the same alias set, even if AA reports them as NoAlias (as implied by metadata or undef). After all they are also MustAlias, and treating them as such makes things a lot simpler.

Simply place them in the earlier alias set that was known for them,
although this entails information loss.
// (This is only known to occur for undef pointer values, which AA treats as
// noalias.)
AS = MapEntry->getForwardedTarget(*this);
AS->Alias = AliasSet::SetMayAlias;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to use SetMayAlias here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, AA has indicated that the new memory location is no-alias with any set, including the set from the map entry. If that set was a must-alias set, it would seem that we have to downgrade to may-alias when adding the new memory location there to be consistent with the no-alias result?

I suppose your reasoning is that any two memory locations with the same pointer value are always must-alias (even if AA indicates they are also no-alias), so we know that the new memory location is also must-alias with the earlier memory location in the set for which map entry was added, and by transitivity with all memory locations in the set?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. I doubt this really matters in practice, but I think keeping the MustAlias here should be safe (and very marginally simpler...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. But then I think it is better to fully build in that the we assume a must-alias result for the existing alias set with a memory location using the same pointer value, and use this in mergeAliasSetsForPointer. I've pushed an updated version, see new commits b7fbb33 and ad4fa84. (This is maybe what you were after with your earlier remark about not calling mergeAliasSetsForPointer when there is a MapEntry: while I think we must call the method and do merging, we can save some alias queries there.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, your updated version looks very clean!

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good toe me.

@brunodf-snps
Copy link
Contributor Author

This looks good to me.

Thanks! I will still do a pass over the method names and comments to use "memory location" terminology.

@jdoerfert I was hoping to have this landed before the llvm-18 branch is cut (Jan 23, plus account for some days margin). Do you still have remarks?

@brunodf-snps brunodf-snps marked this pull request as ready for review January 16, 2024 14:59
@brunodf-snps
Copy link
Contributor Author

I have updated the comments and method names to update the terminology, and I have cleared the draft label now.

To me, this is ready to be merged, I don't think there is anything I can still address.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikic nikic merged commit 656bf13 into llvm:main Jan 17, 2024
@brunodf-snps
Copy link
Contributor Author

Thanks for committing. main has a new test since my branch point llvm/test/Analysis/BasicAA/separate_storage-alias-sets.ll
that is failing, let me fix that.

@brunodf-snps
Copy link
Contributor Author

@nikic
Copy link
Contributor

nikic commented Jan 17, 2024

@brunodf-snps Thanks, committed in 509f634.

ampandey-1995 pushed a commit to ampandey-1995/llvm-project that referenced this pull request Jan 19, 2024
This changes the AliasSetTracker to track memory locations instead of
pointers in its alias sets. The motivation for this is outlined in an RFC
posted on LLVM discourse:
https://discourse.llvm.org/t/rfc-dont-merge-memory-locations-in-aliassettracker/73336

In the data structures of the AST implementation, I made the choice to
replace the linked list of `PointerRec` entries (that had to go anyway)
with a simple flat vector of `MemoryLocation` objects, but for the
`AliasSet` objects referenced from a lookup table, I retained the
mechanism of a linked list, reference counting, forwarding, etc. The
data structures could be revised in a follow-up change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants