Skip to content

[EquivalenceClasses] Use SmallVector for deterministic iteration order. #134075

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 2, 2025

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Apr 2, 2025

Currently iterators over EquivalenceClasses will iterate over std::set, which guarantees the order specified by the comperator. Unfortunately in many cases, EquivalenceClasses are used with pointers, so iterating over std::set of pointers will not be deterministic across runs.

There are multiple places that explicitly try to sort the equivalence classes before using them to try to get a deterministic order (LowerTypeTests, SplitModule), but there are others that do not at the moment and this can result at least in non-determinstic value naming in Float2Int.

This patch updates EquivalenceClasses to keep track of all members via a extra SmallVector and removes code from LowerTypeTests and SplitModule to sort the classes before processing.

Overall it looks like compile-time slightly decreases in most cases, but close to noise:
https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions

@fhahn fhahn requested review from arsenm, nikic and shraiysh April 2, 2025 12:26
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AArch64 backend:AMDGPU tablegen llvm:globalisel clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms llvm:adt labels Apr 2, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 2, 2025

@llvm/pr-subscribers-backend-webassembly
@llvm/pr-subscribers-tablegen
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-adt

@llvm/pr-subscribers-clang

Author: Florian Hahn (fhahn)

Changes

Currently iterators over EquivalenceClasses will iterate over std::set, which guarantees the order specified by the comperator. Unfortunately in many cases, EquivalenceClasses are used with pointers, so iterating over std::set of pointers will not be deterministic across runs.

There are multiple places that explicitly try to sort the equivalence classes before using them to try to get a deterministic order (LowerTypeTests, SplitModule), but there are others that do not at the moment and this can result at least in non-determinstic value naming in Float2Int.

This patch updates EquivalenceClasses to keep track of all members via a extra SmallVector and removes code from LowerTypeTests and SplitModule to sort the classes before processing.

Overall it looks like compile-time slightly decreases in most cases, but close to noise:
https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions


Patch is 21.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134075.diff

16 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp (+4-4)
  • (modified) llvm/include/llvm/ADT/EquivalenceClasses.h (+16-9)
  • (modified) llvm/lib/Analysis/VectorUtils.cpp (+4-4)
  • (modified) llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp (+3-2)
  • (modified) llvm/lib/Transforms/IPO/LowerTypeTests.cpp (+5-24)
  • (modified) llvm/lib/Transforms/Scalar/Float2Int.cpp (+6-4)
  • (modified) llvm/lib/Transforms/Scalar/LoopDistribute.cpp (+3-3)
  • (modified) llvm/lib/Transforms/Utils/SplitModule.cpp (+8-24)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td (+1-1)
  • (modified) llvm/test/Transforms/LowerTypeTests/function-disjoint.ll (+6-6)
  • (modified) llvm/test/Transforms/LowerTypeTests/nonstring.ll (+1-1)
  • (modified) llvm/test/tools/llvm-split/preserve-locals.ll (+7-6)
  • (modified) llvm/test/tools/llvm-split/scc-const-alias.ll (+5-5)
  • (modified) llvm/test/tools/llvm-split/scc-global-alias.ll (+8-8)
  • (modified) llvm/utils/TableGen/GlobalISelCombinerEmitter.cpp (+2-1)
diff --git a/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp b/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
index 69a90334c9df5..3c385ed8ef663 100644
--- a/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
+++ b/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
@@ -111,8 +111,8 @@ void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
     FalseAtoms = projectToLeaders(FalseAtoms, EquivalentAtoms);
 
     llvm::DenseMap<Atom, const Formula *> Substitutions;
-    for (auto It = EquivalentAtoms.begin(); It != EquivalentAtoms.end(); ++It) {
-      Atom TheAtom = It->getData();
+    for (const auto &E : EquivalentAtoms) {
+      Atom TheAtom = E->getData();
       Atom Leader = EquivalentAtoms.getLeaderValue(TheAtom);
       if (TrueAtoms.contains(Leader)) {
         if (FalseAtoms.contains(Leader)) {
@@ -152,9 +152,9 @@ void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
 
   if (Info) {
     for (const auto &E : EquivalentAtoms) {
-      if (!E.isLeader())
+      if (!E->isLeader())
         continue;
-      Atom At = *EquivalentAtoms.findLeader(E);
+      Atom At = *EquivalentAtoms.findLeader(*E);
       if (TrueAtoms.contains(At) || FalseAtoms.contains(At))
         continue;
       llvm::SmallVector<Atom> Atoms =
diff --git a/llvm/include/llvm/ADT/EquivalenceClasses.h b/llvm/include/llvm/ADT/EquivalenceClasses.h
index 46f186a71f5ce..906971baf74af 100644
--- a/llvm/include/llvm/ADT/EquivalenceClasses.h
+++ b/llvm/include/llvm/ADT/EquivalenceClasses.h
@@ -15,6 +15,7 @@
 #ifndef LLVM_ADT_EQUIVALENCECLASSES_H
 #define LLVM_ADT_EQUIVALENCECLASSES_H
 
+#include "llvm/ADT/SmallVector.h"
 #include <cassert>
 #include <cstddef>
 #include <cstdint>
@@ -138,6 +139,9 @@ class EquivalenceClasses {
   /// ECValues, it just keeps the key as part of the value.
   std::set<ECValue, ECValueComparator> TheMapping;
 
+  /// List of all members, used to provide a determinstic iteration order.
+  SmallVector<const ECValue *> Members;
+
 public:
   EquivalenceClasses() = default;
   EquivalenceClasses(const EquivalenceClasses &RHS) {
@@ -146,9 +150,10 @@ class EquivalenceClasses {
 
   EquivalenceClasses &operator=(const EquivalenceClasses &RHS) {
     TheMapping.clear();
+    Members.clear();
     for (const auto &E : RHS)
-      if (E.isLeader()) {
-        member_iterator MI = RHS.member_begin(E);
+      if (E->isLeader()) {
+        member_iterator MI = RHS.member_begin(*E);
         member_iterator LeaderIt = member_begin(insert(*MI));
         for (++MI; MI != member_end(); ++MI)
           unionSets(LeaderIt, member_begin(insert(*MI)));
@@ -161,11 +166,10 @@ class EquivalenceClasses {
   //
 
   /// iterator* - Provides a way to iterate over all values in the set.
-  using iterator =
-      typename std::set<ECValue, ECValueComparator>::const_iterator;
+  using iterator = typename SmallVector<const ECValue *>::const_iterator;
 
-  iterator begin() const { return TheMapping.begin(); }
-  iterator end() const { return TheMapping.end(); }
+  iterator begin() const { return Members.begin(); }
+  iterator end() const { return Members.end(); }
 
   bool empty() const { return TheMapping.empty(); }
 
@@ -173,7 +177,7 @@ class EquivalenceClasses {
   class member_iterator;
   member_iterator member_begin(const ECValue &ECV) const {
     // Only leaders provide anything to iterate over.
-    return member_iterator(ECV.isLeader() ? ECV.getLeader() : nullptr);
+    return member_iterator(ECV.isLeader() ? &ECV : nullptr);
   }
 
   member_iterator member_end() const {
@@ -208,7 +212,7 @@ class EquivalenceClasses {
   unsigned getNumClasses() const {
     unsigned NC = 0;
     for (const auto &E : *this)
-      if (E.isLeader())
+      if (E->isLeader())
         ++NC;
     return NC;
   }
@@ -219,7 +223,10 @@ class EquivalenceClasses {
   /// insert - Insert a new value into the union/find set, ignoring the request
   /// if the value already exists.
   const ECValue &insert(const ElemTy &Data) {
-    return *TheMapping.insert(ECValue(Data)).first;
+    auto I = TheMapping.insert(ECValue(Data));
+    if (I.second)
+      Members.push_back(&*I.first);
+    return *I.first;
   }
 
   /// findLeader - Given a value in the set, return a member iterator for the
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index f57186589a325..663b961da848d 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -844,10 +844,10 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
         DBits[ECs.getOrInsertLeaderValue(I.first)] |= ~0ULL;
 
   for (const auto &E : ECs) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
     uint64_t LeaderDemandedBits = 0;
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end()))
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end()))
       LeaderDemandedBits |= DBits[M];
 
     uint64_t MinBW = llvm::bit_width(LeaderDemandedBits);
@@ -859,7 +859,7 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
     // indvars.
     // If we are required to shrink a PHI, abandon this entire equivalence class.
     bool Abort = false;
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end()))
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end()))
       if (isa<PHINode>(M) && MinBW < M->getType()->getScalarSizeInBits()) {
         Abort = true;
         break;
@@ -867,7 +867,7 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
     if (Abort)
       continue;
 
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end())) {
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end())) {
       auto *MI = dyn_cast<Instruction>(M);
       if (!MI)
         continue;
diff --git a/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp b/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
index c6d40cb00b252..218cded84d76b 100644
--- a/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
+++ b/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
@@ -368,9 +368,9 @@ bool AArch64A57FPLoadBalancing::runOnBasicBlock(MachineBasicBlock &MBB) {
   // Convert the EquivalenceClasses to a simpler set of sets.
   std::vector<std::vector<Chain*> > V;
   for (const auto &E : EC) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
-    std::vector<Chain *> Cs(EC.member_begin(E), EC.member_end());
+    std::vector<Chain *> Cs(EC.member_begin(*E), EC.member_end());
     if (Cs.empty()) continue;
     V.push_back(std::move(Cs));
   }
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
index 4a700bd213ed5..32472201cf9c2 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
@@ -1017,11 +1017,12 @@ void RecursiveSearchSplitting::setupWorkList() {
   }
 
   for (const auto &Node : NodeEC) {
-    if (!Node.isLeader())
+    if (!Node->isLeader())
       continue;
 
     BitVector Cluster = SG.createNodesBitVector();
-    for (auto MI = NodeEC.member_begin(Node); MI != NodeEC.member_end(); ++MI) {
+    for (auto MI = NodeEC.member_begin(*Node); MI != NodeEC.member_end();
+         ++MI) {
       const SplitGraph::Node &N = SG.getNode(*MI);
       if (N.isGraphEntryPoint())
         N.getDependencies(Cluster);
diff --git a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
index 11f9f0271395b..fcd8918f1d9d7 100644
--- a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+++ b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
@@ -2339,36 +2339,17 @@ bool LowerTypeTestsModule::lower() {
   if (GlobalClasses.empty())
     return false;
 
-  // Build a list of disjoint sets ordered by their maximum global index for
-  // determinism.
-  std::vector<std::pair<GlobalClassesTy::iterator, unsigned>> Sets;
-  for (GlobalClassesTy::iterator I = GlobalClasses.begin(),
-                                 E = GlobalClasses.end();
-       I != E; ++I) {
-    if (!I->isLeader())
+  // For each disjoint set we found...
+  for (const auto &C : GlobalClasses) {
+    if (!C->isLeader())
       continue;
-    ++NumTypeIdDisjointSets;
-
-    unsigned MaxUniqueId = 0;
-    for (GlobalClassesTy::member_iterator MI = GlobalClasses.member_begin(*I);
-         MI != GlobalClasses.member_end(); ++MI) {
-      if (auto *MD = dyn_cast_if_present<Metadata *>(*MI))
-        MaxUniqueId = std::max(MaxUniqueId, TypeIdInfo[MD].UniqueId);
-      else if (auto *BF = dyn_cast_if_present<ICallBranchFunnel *>(*MI))
-        MaxUniqueId = std::max(MaxUniqueId, BF->UniqueId);
-    }
-    Sets.emplace_back(I, MaxUniqueId);
-  }
-  llvm::sort(Sets, llvm::less_second());
 
-  // For each disjoint set we found...
-  for (const auto &S : Sets) {
+    ++NumTypeIdDisjointSets;
     // Build the list of type identifiers in this disjoint set.
     std::vector<Metadata *> TypeIds;
     std::vector<GlobalTypeMember *> Globals;
     std::vector<ICallBranchFunnel *> ICallBranchFunnels;
-    for (GlobalClassesTy::member_iterator MI =
-             GlobalClasses.member_begin(*S.first);
+    for (GlobalClassesTy::member_iterator MI = GlobalClasses.member_begin(*C);
          MI != GlobalClasses.member_end(); ++MI) {
       if (isa<Metadata *>(*MI))
         TypeIds.push_back(cast<Metadata *>(*MI));
diff --git a/llvm/lib/Transforms/Scalar/Float2Int.cpp b/llvm/lib/Transforms/Scalar/Float2Int.cpp
index 85c376c564d35..927877b3135e5 100644
--- a/llvm/lib/Transforms/Scalar/Float2Int.cpp
+++ b/llvm/lib/Transforms/Scalar/Float2Int.cpp
@@ -312,14 +312,16 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
 
   // Iterate over every disjoint partition of the def-use graph.
   for (const auto &E : ECs) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
+
     ConstantRange R(MaxIntegerBW + 1, false);
     bool Fail = false;
     Type *ConvertedToTy = nullptr;
 
     // For every member of the partition, union all the ranges together.
-    for (auto MI = ECs.member_begin(E), ME = ECs.member_end(); MI != ME; ++MI) {
+    for (auto MI = ECs.member_begin(*E), ME = ECs.member_end(); MI != ME;
+         ++MI) {
       Instruction *I = *MI;
       auto SeenI = SeenInsts.find(I);
       if (SeenI == SeenInsts.end())
@@ -349,7 +351,7 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
 
     // If the set was empty, or we failed, or the range is poisonous,
     // bail out.
-    if (ECs.member_begin(E) == ECs.member_end() || Fail || R.isFullSet() ||
+    if (ECs.member_begin(*E) == ECs.member_end() || Fail || R.isFullSet() ||
         R.isSignWrappedSet())
       continue;
     assert(ConvertedToTy && "Must have set the convertedtoty by this point!");
@@ -389,7 +391,7 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
       }
     }
 
-    for (auto MI = ECs.member_begin(E), ME = ECs.member_end(); MI != ME; ++MI)
+    for (auto MI = ECs.member_begin(*E), ME = ECs.member_end(); MI != ME; ++MI)
       convert(*MI, Ty);
     MadeChange = true;
   }
diff --git a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
index 5f03d854b51e6..e10d0c0defd96 100644
--- a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
@@ -386,11 +386,11 @@ class InstPartitionContainer {
     // Merge the member of an equivalence class into its class leader.  This
     // makes the members empty.
     for (const auto &C : ToBeMerged) {
-      if (!C.isLeader())
+      if (!C->isLeader())
         continue;
 
-      auto PartI = C.getData();
-      for (auto *PartJ : make_range(std::next(ToBeMerged.member_begin(C)),
+      auto PartI = C->getData();
+      for (auto *PartJ : make_range(std::next(ToBeMerged.member_begin(*C)),
                                     ToBeMerged.member_end())) {
         PartJ->moveTo(*PartI);
       }
diff --git a/llvm/lib/Transforms/Utils/SplitModule.cpp b/llvm/lib/Transforms/Utils/SplitModule.cpp
index 507e3c7a42737..c39771733ee0d 100644
--- a/llvm/lib/Transforms/Utils/SplitModule.cpp
+++ b/llvm/lib/Transforms/Utils/SplitModule.cpp
@@ -170,40 +170,24 @@ static void findPartitions(Module &M, ClusterIDMapType &ClusterIDMap,
   for (unsigned i = 0; i < N; ++i)
     BalancingQueue.push(std::make_pair(i, 0));
 
-  using SortType = std::pair<unsigned, ClusterMapType::iterator>;
-
-  SmallVector<SortType, 64> Sets;
   SmallPtrSet<const GlobalValue *, 32> Visited;
 
   // To guarantee determinism, we have to sort SCC according to size.
   // When size is the same, use leader's name.
-  for (ClusterMapType::iterator I = GVtoClusterMap.begin(),
-                                E = GVtoClusterMap.end();
-       I != E; ++I)
-    if (I->isLeader())
-      Sets.push_back(
-          std::make_pair(std::distance(GVtoClusterMap.member_begin(*I),
-                                       GVtoClusterMap.member_end()),
-                         I));
-
-  llvm::sort(Sets, [](const SortType &a, const SortType &b) {
-    if (a.first == b.first)
-      return a.second->getData()->getName() > b.second->getData()->getName();
-    else
-      return a.first > b.first;
-  });
-
-  for (auto &I : Sets) {
+  for (const auto &C : GVtoClusterMap) {
+    if (!C->isLeader())
+      continue;
+
     unsigned CurrentClusterID = BalancingQueue.top().first;
     unsigned CurrentClusterSize = BalancingQueue.top().second;
     BalancingQueue.pop();
 
     LLVM_DEBUG(dbgs() << "Root[" << CurrentClusterID << "] cluster_size("
-                      << I.first << ") ----> " << I.second->getData()->getName()
-                      << "\n");
+                      << std::distance(GVtoClusterMap.member_begin(*C),
+                                       GVtoClusterMap.member_end())
+                      << ") ----> " << C->getData()->getName() << "\n");
 
-    for (ClusterMapType::member_iterator MI =
-             GVtoClusterMap.findLeader(*I.second);
+    for (ClusterMapType::member_iterator MI = GVtoClusterMap.findLeader(*C);
          MI != GVtoClusterMap.member_end(); ++MI) {
       if (!Visited.insert(*MI).second)
         continue;
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
index ed4e0e411c7af..a965dc7ccdcad 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
@@ -50,7 +50,7 @@ def infer_complex_tempreg: GICombineRule <
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_match_1:          [dst, x]
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_apply_0:          [tmp, y]
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_apply_1:
-// CHECK-NEXT: Final Type Equivalence Classes: [tmp, dst, x, y]  [vec]
+// CHECK-NEXT: Final Type Equivalence Classes: [vec] [tmp, dst, x, y]
 // CHECK-NEXT: INFER: MachineOperand $tmp -> GITypeOf<$dst>
 // CHECK-NEXT: Apply patterns for rule infer_variadic_outs after inference:
 // CHECK-NEXT:   (CodeGenInstructionPattern name:__infer_variadic_outs_apply_0 G_FNEG operands:[<def>GITypeOf<$dst>:$tmp, $y])
diff --git a/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll b/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
index 92281e274adf0..a3fdb3cad94ca 100644
--- a/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
+++ b/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
@@ -5,8 +5,8 @@
 
 target datalayout = "e-p:64:64"
 
-; X64: @f = alias void (), ptr @[[JT0:.*]]
 ; X64: @g = alias void (), ptr @[[JT1:.*]]
+; X64: @f = alias void (), ptr @[[JT0:.*]]
 
 ; WASM32: private constant [0 x i8] zeroinitializer
 @0 = private unnamed_addr constant [2 x ptr] [ptr @f, ptr @g], align 16
@@ -30,20 +30,20 @@ declare i1 @llvm.type.test(ptr %ptr, metadata %bitset) nounwind readnone
 
 define i1 @foo(ptr %p) {
   ; X64: icmp eq i64 {{.*}}, ptrtoint (ptr @[[JT0]] to i64)
-  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 1) to i64)
+  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 2) to i64)
   %x = call i1 @llvm.type.test(ptr %p, metadata !"typeid1")
   ; X64: icmp eq i64 {{.*}}, ptrtoint (ptr @[[JT1]] to i64)
-  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 2) to i64)
+  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 1) to i64)
   %y = call i1 @llvm.type.test(ptr %p, metadata !"typeid2")
   %z = add i1 %x, %y
   ret i1 %z
 }
 
-; X64: define private void @[[JT0]]() #{{.*}} align 8 {
-; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @f.cfi)
-
 ; X64: define private void @[[JT1]]() #{{.*}} align 8 {
 ; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @g.cfi)
 
+; X64: define private void @[[JT0]]() #{{.*}} align 8 {
+; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @f.cfi)
+
 ; WASM32: ![[I0]] = !{i64 1}
 ; WASM32: ![[I1]] = !{i64 2}
diff --git a/llvm/test/Transforms/LowerTypeTests/nonstring.ll b/llvm/test/Transforms/LowerTypeTests/nonstring.ll
index ff8cc52d48344..ff7189aa0189c 100644
--- a/llvm/test/Transforms/LowerTypeTests/nonstring.ll
+++ b/llvm/test/Transforms/LowerTypeTests/nonstring.ll
@@ -4,8 +4,8 @@
 
 target datalayout = "e-p:32:32"
 
-; CHECK: @[[ANAME:.*]] = private constant { i32 }
 ; CHECK: @[[BNAME:.*]] = private constant { [2 x i32] }
+; CHECK: @[[ANAME:.*]] = private constant { i32 }
 
 @a = constant i32 1, !type !0
 @b = constant [2 x i32] [i32 2, i32 3], !type !1
diff --git a/llvm/test/tools/llvm-split/preserve-locals.ll b/llvm/test/tools/llvm-split/preserve-locals.ll
index d128daaf35dd7..ff0610bd65499 100644
--- a/llvm/test/tools/llvm-split/preserve-locals.ll
+++ b/llvm/test/tools/llvm-split/preserve-locals.ll
@@ -2,14 +2,15 @@
 ; RUN: llvm-dis -o - %t0 | FileCheck --check-prefix=CHECK0 %s
 ; RUN: llvm-dis -o - %t1 | FileCheck --check-prefix=CHECK1 %s
 
-; The local_var and local_func must not be separated.
-; CHECK0: @local_var
-; CHECK0: define internal fastcc void @local_func
 ; The main and a must not be separated.
 ; The main and local_func must not be together.
-; CHECK1: @a
-; CHECK1: define i32 @main
-; CHECK1: declare dso_local fastcc void @local_func
+; CHECK0: @a
+; CHECK0: define i32 @main
+; CHECK0: declare dso_local fastcc void @local_func
+
+; The local_var and local_func must not be separated.
+; CHECK1: @local_var
+; CHECK1: define internal fastcc void @local_func
 
 @a = internal global i32 0, align 4
 @global_storage = common global i32 0, align 4
diff --git a/llvm/test/tools/llvm-split/scc-const-alias.ll b/llvm/test/tools/llvm-split/scc-const-alias.ll
index 20670af416c44..9e66f38f50843 100644
--- a/llvm/test/tools/llvm-split/scc-const-alias.ll
+++ b/llvm/test/tools/llvm-split/scc-const-alias.ll
@@ -5,12 +5,12 @@
 ; RUN: llvm-dis -o - %t2 | FileCheck --check-prefix=CHECK2 %s
 
 ; Checks are not critical here - verifier will assert if we fail.
-; CHECK0: @g1 = global i32 99
-; CHECK0: @c1Alias = external global i8
-; CHECK0: @g1Alias = internal alias i8, ptr @g1
+; CHECK0: @g1 = external global i32
+; CHECK0: @c1Alias = internal alias i8, inttoptr (i64 42 to ptr)
 
-; CHECK1: @g1 = external global i32
-; CHECK1: @c1Alias = internal alias i8, inttoptr (i64 42 to ptr)
+; CHECK1: @g1 = global i32 99
+; CHECK1: @c1Alias = external global i8
+; CHECK1: @g1Alias = internal alias i8, ptr @g1
 
 ; Third file is actually empty.
 ; CHECK2: @g1 = external global i32
diff --git a/llvm/test/tools/llvm-split/scc-global-alias.ll b/llvm/test/tools/llvm-split/scc-global-alias.ll
index ee3b6a1c1ce1a..b3b52ccd535a0 100644
--- a/llvm/test/tools/llvm-split/scc-global-alias.ll
+++ b/llvm/test/tools/llvm-split/scc-global-alias.ll
@@ -5,16 +5,16 @@
 ; RUN: llvm-dis -o - %t2 | FileCheck --check-prefix=CHECK2 %s
 
 ; Checks are not critical here - verifier will assert if we fail.
-; CHECK0: @funInternal2Alias = alias
-; CHECK0: @funExternal2Alias = alias
-; CHECK0: define internal i32 @funInternal2
-; CHECK0: define i32 @funExternal2
+; CHECK0: @funInternalAlias = alias
+; CHECK0: define internal i32 @funInternal
 
-; CHECK1: @funInternalAlias = alias
-; CHECK1: define internal i32 @funInternal
+; CHECK1: @funExternalAlias = alias
+; CHECK1: define...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Apr 2, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Florian Hahn (fhahn)

Changes

Currently iterators over EquivalenceClasses will iterate over std::set, which guarantees the order specified by the comperator. Unfortunately in many cases, EquivalenceClasses are used with pointers, so iterating over std::set of pointers will not be deterministic across runs.

There are multiple places that explicitly try to sort the equivalence classes before using them to try to get a deterministic order (LowerTypeTests, SplitModule), but there are others that do not at the moment and this can result at least in non-determinstic value naming in Float2Int.

This patch updates EquivalenceClasses to keep track of all members via a extra SmallVector and removes code from LowerTypeTests and SplitModule to sort the classes before processing.

Overall it looks like compile-time slightly decreases in most cases, but close to noise:
https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&amp;to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&amp;stat=instructions


Patch is 21.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134075.diff

16 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp (+4-4)
  • (modified) llvm/include/llvm/ADT/EquivalenceClasses.h (+16-9)
  • (modified) llvm/lib/Analysis/VectorUtils.cpp (+4-4)
  • (modified) llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp (+3-2)
  • (modified) llvm/lib/Transforms/IPO/LowerTypeTests.cpp (+5-24)
  • (modified) llvm/lib/Transforms/Scalar/Float2Int.cpp (+6-4)
  • (modified) llvm/lib/Transforms/Scalar/LoopDistribute.cpp (+3-3)
  • (modified) llvm/lib/Transforms/Utils/SplitModule.cpp (+8-24)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td (+1-1)
  • (modified) llvm/test/Transforms/LowerTypeTests/function-disjoint.ll (+6-6)
  • (modified) llvm/test/Transforms/LowerTypeTests/nonstring.ll (+1-1)
  • (modified) llvm/test/tools/llvm-split/preserve-locals.ll (+7-6)
  • (modified) llvm/test/tools/llvm-split/scc-const-alias.ll (+5-5)
  • (modified) llvm/test/tools/llvm-split/scc-global-alias.ll (+8-8)
  • (modified) llvm/utils/TableGen/GlobalISelCombinerEmitter.cpp (+2-1)
diff --git a/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp b/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
index 69a90334c9df5..3c385ed8ef663 100644
--- a/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
+++ b/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
@@ -111,8 +111,8 @@ void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
     FalseAtoms = projectToLeaders(FalseAtoms, EquivalentAtoms);
 
     llvm::DenseMap<Atom, const Formula *> Substitutions;
-    for (auto It = EquivalentAtoms.begin(); It != EquivalentAtoms.end(); ++It) {
-      Atom TheAtom = It->getData();
+    for (const auto &E : EquivalentAtoms) {
+      Atom TheAtom = E->getData();
       Atom Leader = EquivalentAtoms.getLeaderValue(TheAtom);
       if (TrueAtoms.contains(Leader)) {
         if (FalseAtoms.contains(Leader)) {
@@ -152,9 +152,9 @@ void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
 
   if (Info) {
     for (const auto &E : EquivalentAtoms) {
-      if (!E.isLeader())
+      if (!E->isLeader())
         continue;
-      Atom At = *EquivalentAtoms.findLeader(E);
+      Atom At = *EquivalentAtoms.findLeader(*E);
       if (TrueAtoms.contains(At) || FalseAtoms.contains(At))
         continue;
       llvm::SmallVector<Atom> Atoms =
diff --git a/llvm/include/llvm/ADT/EquivalenceClasses.h b/llvm/include/llvm/ADT/EquivalenceClasses.h
index 46f186a71f5ce..906971baf74af 100644
--- a/llvm/include/llvm/ADT/EquivalenceClasses.h
+++ b/llvm/include/llvm/ADT/EquivalenceClasses.h
@@ -15,6 +15,7 @@
 #ifndef LLVM_ADT_EQUIVALENCECLASSES_H
 #define LLVM_ADT_EQUIVALENCECLASSES_H
 
+#include "llvm/ADT/SmallVector.h"
 #include <cassert>
 #include <cstddef>
 #include <cstdint>
@@ -138,6 +139,9 @@ class EquivalenceClasses {
   /// ECValues, it just keeps the key as part of the value.
   std::set<ECValue, ECValueComparator> TheMapping;
 
+  /// List of all members, used to provide a determinstic iteration order.
+  SmallVector<const ECValue *> Members;
+
 public:
   EquivalenceClasses() = default;
   EquivalenceClasses(const EquivalenceClasses &RHS) {
@@ -146,9 +150,10 @@ class EquivalenceClasses {
 
   EquivalenceClasses &operator=(const EquivalenceClasses &RHS) {
     TheMapping.clear();
+    Members.clear();
     for (const auto &E : RHS)
-      if (E.isLeader()) {
-        member_iterator MI = RHS.member_begin(E);
+      if (E->isLeader()) {
+        member_iterator MI = RHS.member_begin(*E);
         member_iterator LeaderIt = member_begin(insert(*MI));
         for (++MI; MI != member_end(); ++MI)
           unionSets(LeaderIt, member_begin(insert(*MI)));
@@ -161,11 +166,10 @@ class EquivalenceClasses {
   //
 
   /// iterator* - Provides a way to iterate over all values in the set.
-  using iterator =
-      typename std::set<ECValue, ECValueComparator>::const_iterator;
+  using iterator = typename SmallVector<const ECValue *>::const_iterator;
 
-  iterator begin() const { return TheMapping.begin(); }
-  iterator end() const { return TheMapping.end(); }
+  iterator begin() const { return Members.begin(); }
+  iterator end() const { return Members.end(); }
 
   bool empty() const { return TheMapping.empty(); }
 
@@ -173,7 +177,7 @@ class EquivalenceClasses {
   class member_iterator;
   member_iterator member_begin(const ECValue &ECV) const {
     // Only leaders provide anything to iterate over.
-    return member_iterator(ECV.isLeader() ? ECV.getLeader() : nullptr);
+    return member_iterator(ECV.isLeader() ? &ECV : nullptr);
   }
 
   member_iterator member_end() const {
@@ -208,7 +212,7 @@ class EquivalenceClasses {
   unsigned getNumClasses() const {
     unsigned NC = 0;
     for (const auto &E : *this)
-      if (E.isLeader())
+      if (E->isLeader())
         ++NC;
     return NC;
   }
@@ -219,7 +223,10 @@ class EquivalenceClasses {
   /// insert - Insert a new value into the union/find set, ignoring the request
   /// if the value already exists.
   const ECValue &insert(const ElemTy &Data) {
-    return *TheMapping.insert(ECValue(Data)).first;
+    auto I = TheMapping.insert(ECValue(Data));
+    if (I.second)
+      Members.push_back(&*I.first);
+    return *I.first;
   }
 
   /// findLeader - Given a value in the set, return a member iterator for the
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index f57186589a325..663b961da848d 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -844,10 +844,10 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
         DBits[ECs.getOrInsertLeaderValue(I.first)] |= ~0ULL;
 
   for (const auto &E : ECs) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
     uint64_t LeaderDemandedBits = 0;
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end()))
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end()))
       LeaderDemandedBits |= DBits[M];
 
     uint64_t MinBW = llvm::bit_width(LeaderDemandedBits);
@@ -859,7 +859,7 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
     // indvars.
     // If we are required to shrink a PHI, abandon this entire equivalence class.
     bool Abort = false;
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end()))
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end()))
       if (isa<PHINode>(M) && MinBW < M->getType()->getScalarSizeInBits()) {
         Abort = true;
         break;
@@ -867,7 +867,7 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
     if (Abort)
       continue;
 
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end())) {
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end())) {
       auto *MI = dyn_cast<Instruction>(M);
       if (!MI)
         continue;
diff --git a/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp b/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
index c6d40cb00b252..218cded84d76b 100644
--- a/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
+++ b/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
@@ -368,9 +368,9 @@ bool AArch64A57FPLoadBalancing::runOnBasicBlock(MachineBasicBlock &MBB) {
   // Convert the EquivalenceClasses to a simpler set of sets.
   std::vector<std::vector<Chain*> > V;
   for (const auto &E : EC) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
-    std::vector<Chain *> Cs(EC.member_begin(E), EC.member_end());
+    std::vector<Chain *> Cs(EC.member_begin(*E), EC.member_end());
     if (Cs.empty()) continue;
     V.push_back(std::move(Cs));
   }
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
index 4a700bd213ed5..32472201cf9c2 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
@@ -1017,11 +1017,12 @@ void RecursiveSearchSplitting::setupWorkList() {
   }
 
   for (const auto &Node : NodeEC) {
-    if (!Node.isLeader())
+    if (!Node->isLeader())
       continue;
 
     BitVector Cluster = SG.createNodesBitVector();
-    for (auto MI = NodeEC.member_begin(Node); MI != NodeEC.member_end(); ++MI) {
+    for (auto MI = NodeEC.member_begin(*Node); MI != NodeEC.member_end();
+         ++MI) {
       const SplitGraph::Node &N = SG.getNode(*MI);
       if (N.isGraphEntryPoint())
         N.getDependencies(Cluster);
diff --git a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
index 11f9f0271395b..fcd8918f1d9d7 100644
--- a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+++ b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
@@ -2339,36 +2339,17 @@ bool LowerTypeTestsModule::lower() {
   if (GlobalClasses.empty())
     return false;
 
-  // Build a list of disjoint sets ordered by their maximum global index for
-  // determinism.
-  std::vector<std::pair<GlobalClassesTy::iterator, unsigned>> Sets;
-  for (GlobalClassesTy::iterator I = GlobalClasses.begin(),
-                                 E = GlobalClasses.end();
-       I != E; ++I) {
-    if (!I->isLeader())
+  // For each disjoint set we found...
+  for (const auto &C : GlobalClasses) {
+    if (!C->isLeader())
       continue;
-    ++NumTypeIdDisjointSets;
-
-    unsigned MaxUniqueId = 0;
-    for (GlobalClassesTy::member_iterator MI = GlobalClasses.member_begin(*I);
-         MI != GlobalClasses.member_end(); ++MI) {
-      if (auto *MD = dyn_cast_if_present<Metadata *>(*MI))
-        MaxUniqueId = std::max(MaxUniqueId, TypeIdInfo[MD].UniqueId);
-      else if (auto *BF = dyn_cast_if_present<ICallBranchFunnel *>(*MI))
-        MaxUniqueId = std::max(MaxUniqueId, BF->UniqueId);
-    }
-    Sets.emplace_back(I, MaxUniqueId);
-  }
-  llvm::sort(Sets, llvm::less_second());
 
-  // For each disjoint set we found...
-  for (const auto &S : Sets) {
+    ++NumTypeIdDisjointSets;
     // Build the list of type identifiers in this disjoint set.
     std::vector<Metadata *> TypeIds;
     std::vector<GlobalTypeMember *> Globals;
     std::vector<ICallBranchFunnel *> ICallBranchFunnels;
-    for (GlobalClassesTy::member_iterator MI =
-             GlobalClasses.member_begin(*S.first);
+    for (GlobalClassesTy::member_iterator MI = GlobalClasses.member_begin(*C);
          MI != GlobalClasses.member_end(); ++MI) {
       if (isa<Metadata *>(*MI))
         TypeIds.push_back(cast<Metadata *>(*MI));
diff --git a/llvm/lib/Transforms/Scalar/Float2Int.cpp b/llvm/lib/Transforms/Scalar/Float2Int.cpp
index 85c376c564d35..927877b3135e5 100644
--- a/llvm/lib/Transforms/Scalar/Float2Int.cpp
+++ b/llvm/lib/Transforms/Scalar/Float2Int.cpp
@@ -312,14 +312,16 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
 
   // Iterate over every disjoint partition of the def-use graph.
   for (const auto &E : ECs) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
+
     ConstantRange R(MaxIntegerBW + 1, false);
     bool Fail = false;
     Type *ConvertedToTy = nullptr;
 
     // For every member of the partition, union all the ranges together.
-    for (auto MI = ECs.member_begin(E), ME = ECs.member_end(); MI != ME; ++MI) {
+    for (auto MI = ECs.member_begin(*E), ME = ECs.member_end(); MI != ME;
+         ++MI) {
       Instruction *I = *MI;
       auto SeenI = SeenInsts.find(I);
       if (SeenI == SeenInsts.end())
@@ -349,7 +351,7 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
 
     // If the set was empty, or we failed, or the range is poisonous,
     // bail out.
-    if (ECs.member_begin(E) == ECs.member_end() || Fail || R.isFullSet() ||
+    if (ECs.member_begin(*E) == ECs.member_end() || Fail || R.isFullSet() ||
         R.isSignWrappedSet())
       continue;
     assert(ConvertedToTy && "Must have set the convertedtoty by this point!");
@@ -389,7 +391,7 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
       }
     }
 
-    for (auto MI = ECs.member_begin(E), ME = ECs.member_end(); MI != ME; ++MI)
+    for (auto MI = ECs.member_begin(*E), ME = ECs.member_end(); MI != ME; ++MI)
       convert(*MI, Ty);
     MadeChange = true;
   }
diff --git a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
index 5f03d854b51e6..e10d0c0defd96 100644
--- a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
@@ -386,11 +386,11 @@ class InstPartitionContainer {
     // Merge the member of an equivalence class into its class leader.  This
     // makes the members empty.
     for (const auto &C : ToBeMerged) {
-      if (!C.isLeader())
+      if (!C->isLeader())
         continue;
 
-      auto PartI = C.getData();
-      for (auto *PartJ : make_range(std::next(ToBeMerged.member_begin(C)),
+      auto PartI = C->getData();
+      for (auto *PartJ : make_range(std::next(ToBeMerged.member_begin(*C)),
                                     ToBeMerged.member_end())) {
         PartJ->moveTo(*PartI);
       }
diff --git a/llvm/lib/Transforms/Utils/SplitModule.cpp b/llvm/lib/Transforms/Utils/SplitModule.cpp
index 507e3c7a42737..c39771733ee0d 100644
--- a/llvm/lib/Transforms/Utils/SplitModule.cpp
+++ b/llvm/lib/Transforms/Utils/SplitModule.cpp
@@ -170,40 +170,24 @@ static void findPartitions(Module &M, ClusterIDMapType &ClusterIDMap,
   for (unsigned i = 0; i < N; ++i)
     BalancingQueue.push(std::make_pair(i, 0));
 
-  using SortType = std::pair<unsigned, ClusterMapType::iterator>;
-
-  SmallVector<SortType, 64> Sets;
   SmallPtrSet<const GlobalValue *, 32> Visited;
 
   // To guarantee determinism, we have to sort SCC according to size.
   // When size is the same, use leader's name.
-  for (ClusterMapType::iterator I = GVtoClusterMap.begin(),
-                                E = GVtoClusterMap.end();
-       I != E; ++I)
-    if (I->isLeader())
-      Sets.push_back(
-          std::make_pair(std::distance(GVtoClusterMap.member_begin(*I),
-                                       GVtoClusterMap.member_end()),
-                         I));
-
-  llvm::sort(Sets, [](const SortType &a, const SortType &b) {
-    if (a.first == b.first)
-      return a.second->getData()->getName() > b.second->getData()->getName();
-    else
-      return a.first > b.first;
-  });
-
-  for (auto &I : Sets) {
+  for (const auto &C : GVtoClusterMap) {
+    if (!C->isLeader())
+      continue;
+
     unsigned CurrentClusterID = BalancingQueue.top().first;
     unsigned CurrentClusterSize = BalancingQueue.top().second;
     BalancingQueue.pop();
 
     LLVM_DEBUG(dbgs() << "Root[" << CurrentClusterID << "] cluster_size("
-                      << I.first << ") ----> " << I.second->getData()->getName()
-                      << "\n");
+                      << std::distance(GVtoClusterMap.member_begin(*C),
+                                       GVtoClusterMap.member_end())
+                      << ") ----> " << C->getData()->getName() << "\n");
 
-    for (ClusterMapType::member_iterator MI =
-             GVtoClusterMap.findLeader(*I.second);
+    for (ClusterMapType::member_iterator MI = GVtoClusterMap.findLeader(*C);
          MI != GVtoClusterMap.member_end(); ++MI) {
       if (!Visited.insert(*MI).second)
         continue;
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
index ed4e0e411c7af..a965dc7ccdcad 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
@@ -50,7 +50,7 @@ def infer_complex_tempreg: GICombineRule <
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_match_1:          [dst, x]
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_apply_0:          [tmp, y]
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_apply_1:
-// CHECK-NEXT: Final Type Equivalence Classes: [tmp, dst, x, y]  [vec]
+// CHECK-NEXT: Final Type Equivalence Classes: [vec] [tmp, dst, x, y]
 // CHECK-NEXT: INFER: MachineOperand $tmp -> GITypeOf<$dst>
 // CHECK-NEXT: Apply patterns for rule infer_variadic_outs after inference:
 // CHECK-NEXT:   (CodeGenInstructionPattern name:__infer_variadic_outs_apply_0 G_FNEG operands:[<def>GITypeOf<$dst>:$tmp, $y])
diff --git a/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll b/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
index 92281e274adf0..a3fdb3cad94ca 100644
--- a/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
+++ b/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
@@ -5,8 +5,8 @@
 
 target datalayout = "e-p:64:64"
 
-; X64: @f = alias void (), ptr @[[JT0:.*]]
 ; X64: @g = alias void (), ptr @[[JT1:.*]]
+; X64: @f = alias void (), ptr @[[JT0:.*]]
 
 ; WASM32: private constant [0 x i8] zeroinitializer
 @0 = private unnamed_addr constant [2 x ptr] [ptr @f, ptr @g], align 16
@@ -30,20 +30,20 @@ declare i1 @llvm.type.test(ptr %ptr, metadata %bitset) nounwind readnone
 
 define i1 @foo(ptr %p) {
   ; X64: icmp eq i64 {{.*}}, ptrtoint (ptr @[[JT0]] to i64)
-  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 1) to i64)
+  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 2) to i64)
   %x = call i1 @llvm.type.test(ptr %p, metadata !"typeid1")
   ; X64: icmp eq i64 {{.*}}, ptrtoint (ptr @[[JT1]] to i64)
-  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 2) to i64)
+  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 1) to i64)
   %y = call i1 @llvm.type.test(ptr %p, metadata !"typeid2")
   %z = add i1 %x, %y
   ret i1 %z
 }
 
-; X64: define private void @[[JT0]]() #{{.*}} align 8 {
-; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @f.cfi)
-
 ; X64: define private void @[[JT1]]() #{{.*}} align 8 {
 ; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @g.cfi)
 
+; X64: define private void @[[JT0]]() #{{.*}} align 8 {
+; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @f.cfi)
+
 ; WASM32: ![[I0]] = !{i64 1}
 ; WASM32: ![[I1]] = !{i64 2}
diff --git a/llvm/test/Transforms/LowerTypeTests/nonstring.ll b/llvm/test/Transforms/LowerTypeTests/nonstring.ll
index ff8cc52d48344..ff7189aa0189c 100644
--- a/llvm/test/Transforms/LowerTypeTests/nonstring.ll
+++ b/llvm/test/Transforms/LowerTypeTests/nonstring.ll
@@ -4,8 +4,8 @@
 
 target datalayout = "e-p:32:32"
 
-; CHECK: @[[ANAME:.*]] = private constant { i32 }
 ; CHECK: @[[BNAME:.*]] = private constant { [2 x i32] }
+; CHECK: @[[ANAME:.*]] = private constant { i32 }
 
 @a = constant i32 1, !type !0
 @b = constant [2 x i32] [i32 2, i32 3], !type !1
diff --git a/llvm/test/tools/llvm-split/preserve-locals.ll b/llvm/test/tools/llvm-split/preserve-locals.ll
index d128daaf35dd7..ff0610bd65499 100644
--- a/llvm/test/tools/llvm-split/preserve-locals.ll
+++ b/llvm/test/tools/llvm-split/preserve-locals.ll
@@ -2,14 +2,15 @@
 ; RUN: llvm-dis -o - %t0 | FileCheck --check-prefix=CHECK0 %s
 ; RUN: llvm-dis -o - %t1 | FileCheck --check-prefix=CHECK1 %s
 
-; The local_var and local_func must not be separated.
-; CHECK0: @local_var
-; CHECK0: define internal fastcc void @local_func
 ; The main and a must not be separated.
 ; The main and local_func must not be together.
-; CHECK1: @a
-; CHECK1: define i32 @main
-; CHECK1: declare dso_local fastcc void @local_func
+; CHECK0: @a
+; CHECK0: define i32 @main
+; CHECK0: declare dso_local fastcc void @local_func
+
+; The local_var and local_func must not be separated.
+; CHECK1: @local_var
+; CHECK1: define internal fastcc void @local_func
 
 @a = internal global i32 0, align 4
 @global_storage = common global i32 0, align 4
diff --git a/llvm/test/tools/llvm-split/scc-const-alias.ll b/llvm/test/tools/llvm-split/scc-const-alias.ll
index 20670af416c44..9e66f38f50843 100644
--- a/llvm/test/tools/llvm-split/scc-const-alias.ll
+++ b/llvm/test/tools/llvm-split/scc-const-alias.ll
@@ -5,12 +5,12 @@
 ; RUN: llvm-dis -o - %t2 | FileCheck --check-prefix=CHECK2 %s
 
 ; Checks are not critical here - verifier will assert if we fail.
-; CHECK0: @g1 = global i32 99
-; CHECK0: @c1Alias = external global i8
-; CHECK0: @g1Alias = internal alias i8, ptr @g1
+; CHECK0: @g1 = external global i32
+; CHECK0: @c1Alias = internal alias i8, inttoptr (i64 42 to ptr)
 
-; CHECK1: @g1 = external global i32
-; CHECK1: @c1Alias = internal alias i8, inttoptr (i64 42 to ptr)
+; CHECK1: @g1 = global i32 99
+; CHECK1: @c1Alias = external global i8
+; CHECK1: @g1Alias = internal alias i8, ptr @g1
 
 ; Third file is actually empty.
 ; CHECK2: @g1 = external global i32
diff --git a/llvm/test/tools/llvm-split/scc-global-alias.ll b/llvm/test/tools/llvm-split/scc-global-alias.ll
index ee3b6a1c1ce1a..b3b52ccd535a0 100644
--- a/llvm/test/tools/llvm-split/scc-global-alias.ll
+++ b/llvm/test/tools/llvm-split/scc-global-alias.ll
@@ -5,16 +5,16 @@
 ; RUN: llvm-dis -o - %t2 | FileCheck --check-prefix=CHECK2 %s
 
 ; Checks are not critical here - verifier will assert if we fail.
-; CHECK0: @funInternal2Alias = alias
-; CHECK0: @funExternal2Alias = alias
-; CHECK0: define internal i32 @funInternal2
-; CHECK0: define i32 @funExternal2
+; CHECK0: @funInternalAlias = alias
+; CHECK0: define internal i32 @funInternal
 
-; CHECK1: @funInternalAlias = alias
-; CHECK1: define internal i32 @funInternal
+; CHECK1: @funExternalAlias = alias
+; CHECK1: define...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Apr 2, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Florian Hahn (fhahn)

Changes

Currently iterators over EquivalenceClasses will iterate over std::set, which guarantees the order specified by the comperator. Unfortunately in many cases, EquivalenceClasses are used with pointers, so iterating over std::set of pointers will not be deterministic across runs.

There are multiple places that explicitly try to sort the equivalence classes before using them to try to get a deterministic order (LowerTypeTests, SplitModule), but there are others that do not at the moment and this can result at least in non-determinstic value naming in Float2Int.

This patch updates EquivalenceClasses to keep track of all members via a extra SmallVector and removes code from LowerTypeTests and SplitModule to sort the classes before processing.

Overall it looks like compile-time slightly decreases in most cases, but close to noise:
https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&amp;to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&amp;stat=instructions


Patch is 21.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134075.diff

16 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp (+4-4)
  • (modified) llvm/include/llvm/ADT/EquivalenceClasses.h (+16-9)
  • (modified) llvm/lib/Analysis/VectorUtils.cpp (+4-4)
  • (modified) llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp (+3-2)
  • (modified) llvm/lib/Transforms/IPO/LowerTypeTests.cpp (+5-24)
  • (modified) llvm/lib/Transforms/Scalar/Float2Int.cpp (+6-4)
  • (modified) llvm/lib/Transforms/Scalar/LoopDistribute.cpp (+3-3)
  • (modified) llvm/lib/Transforms/Utils/SplitModule.cpp (+8-24)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td (+1-1)
  • (modified) llvm/test/Transforms/LowerTypeTests/function-disjoint.ll (+6-6)
  • (modified) llvm/test/Transforms/LowerTypeTests/nonstring.ll (+1-1)
  • (modified) llvm/test/tools/llvm-split/preserve-locals.ll (+7-6)
  • (modified) llvm/test/tools/llvm-split/scc-const-alias.ll (+5-5)
  • (modified) llvm/test/tools/llvm-split/scc-global-alias.ll (+8-8)
  • (modified) llvm/utils/TableGen/GlobalISelCombinerEmitter.cpp (+2-1)
diff --git a/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp b/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
index 69a90334c9df5..3c385ed8ef663 100644
--- a/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
+++ b/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
@@ -111,8 +111,8 @@ void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
     FalseAtoms = projectToLeaders(FalseAtoms, EquivalentAtoms);
 
     llvm::DenseMap<Atom, const Formula *> Substitutions;
-    for (auto It = EquivalentAtoms.begin(); It != EquivalentAtoms.end(); ++It) {
-      Atom TheAtom = It->getData();
+    for (const auto &E : EquivalentAtoms) {
+      Atom TheAtom = E->getData();
       Atom Leader = EquivalentAtoms.getLeaderValue(TheAtom);
       if (TrueAtoms.contains(Leader)) {
         if (FalseAtoms.contains(Leader)) {
@@ -152,9 +152,9 @@ void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
 
   if (Info) {
     for (const auto &E : EquivalentAtoms) {
-      if (!E.isLeader())
+      if (!E->isLeader())
         continue;
-      Atom At = *EquivalentAtoms.findLeader(E);
+      Atom At = *EquivalentAtoms.findLeader(*E);
       if (TrueAtoms.contains(At) || FalseAtoms.contains(At))
         continue;
       llvm::SmallVector<Atom> Atoms =
diff --git a/llvm/include/llvm/ADT/EquivalenceClasses.h b/llvm/include/llvm/ADT/EquivalenceClasses.h
index 46f186a71f5ce..906971baf74af 100644
--- a/llvm/include/llvm/ADT/EquivalenceClasses.h
+++ b/llvm/include/llvm/ADT/EquivalenceClasses.h
@@ -15,6 +15,7 @@
 #ifndef LLVM_ADT_EQUIVALENCECLASSES_H
 #define LLVM_ADT_EQUIVALENCECLASSES_H
 
+#include "llvm/ADT/SmallVector.h"
 #include <cassert>
 #include <cstddef>
 #include <cstdint>
@@ -138,6 +139,9 @@ class EquivalenceClasses {
   /// ECValues, it just keeps the key as part of the value.
   std::set<ECValue, ECValueComparator> TheMapping;
 
+  /// List of all members, used to provide a determinstic iteration order.
+  SmallVector<const ECValue *> Members;
+
 public:
   EquivalenceClasses() = default;
   EquivalenceClasses(const EquivalenceClasses &RHS) {
@@ -146,9 +150,10 @@ class EquivalenceClasses {
 
   EquivalenceClasses &operator=(const EquivalenceClasses &RHS) {
     TheMapping.clear();
+    Members.clear();
     for (const auto &E : RHS)
-      if (E.isLeader()) {
-        member_iterator MI = RHS.member_begin(E);
+      if (E->isLeader()) {
+        member_iterator MI = RHS.member_begin(*E);
         member_iterator LeaderIt = member_begin(insert(*MI));
         for (++MI; MI != member_end(); ++MI)
           unionSets(LeaderIt, member_begin(insert(*MI)));
@@ -161,11 +166,10 @@ class EquivalenceClasses {
   //
 
   /// iterator* - Provides a way to iterate over all values in the set.
-  using iterator =
-      typename std::set<ECValue, ECValueComparator>::const_iterator;
+  using iterator = typename SmallVector<const ECValue *>::const_iterator;
 
-  iterator begin() const { return TheMapping.begin(); }
-  iterator end() const { return TheMapping.end(); }
+  iterator begin() const { return Members.begin(); }
+  iterator end() const { return Members.end(); }
 
   bool empty() const { return TheMapping.empty(); }
 
@@ -173,7 +177,7 @@ class EquivalenceClasses {
   class member_iterator;
   member_iterator member_begin(const ECValue &ECV) const {
     // Only leaders provide anything to iterate over.
-    return member_iterator(ECV.isLeader() ? ECV.getLeader() : nullptr);
+    return member_iterator(ECV.isLeader() ? &ECV : nullptr);
   }
 
   member_iterator member_end() const {
@@ -208,7 +212,7 @@ class EquivalenceClasses {
   unsigned getNumClasses() const {
     unsigned NC = 0;
     for (const auto &E : *this)
-      if (E.isLeader())
+      if (E->isLeader())
         ++NC;
     return NC;
   }
@@ -219,7 +223,10 @@ class EquivalenceClasses {
   /// insert - Insert a new value into the union/find set, ignoring the request
   /// if the value already exists.
   const ECValue &insert(const ElemTy &Data) {
-    return *TheMapping.insert(ECValue(Data)).first;
+    auto I = TheMapping.insert(ECValue(Data));
+    if (I.second)
+      Members.push_back(&*I.first);
+    return *I.first;
   }
 
   /// findLeader - Given a value in the set, return a member iterator for the
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index f57186589a325..663b961da848d 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -844,10 +844,10 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
         DBits[ECs.getOrInsertLeaderValue(I.first)] |= ~0ULL;
 
   for (const auto &E : ECs) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
     uint64_t LeaderDemandedBits = 0;
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end()))
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end()))
       LeaderDemandedBits |= DBits[M];
 
     uint64_t MinBW = llvm::bit_width(LeaderDemandedBits);
@@ -859,7 +859,7 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
     // indvars.
     // If we are required to shrink a PHI, abandon this entire equivalence class.
     bool Abort = false;
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end()))
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end()))
       if (isa<PHINode>(M) && MinBW < M->getType()->getScalarSizeInBits()) {
         Abort = true;
         break;
@@ -867,7 +867,7 @@ llvm::computeMinimumValueSizes(ArrayRef<BasicBlock *> Blocks, DemandedBits &DB,
     if (Abort)
       continue;
 
-    for (Value *M : make_range(ECs.member_begin(E), ECs.member_end())) {
+    for (Value *M : make_range(ECs.member_begin(*E), ECs.member_end())) {
       auto *MI = dyn_cast<Instruction>(M);
       if (!MI)
         continue;
diff --git a/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp b/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
index c6d40cb00b252..218cded84d76b 100644
--- a/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
+++ b/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
@@ -368,9 +368,9 @@ bool AArch64A57FPLoadBalancing::runOnBasicBlock(MachineBasicBlock &MBB) {
   // Convert the EquivalenceClasses to a simpler set of sets.
   std::vector<std::vector<Chain*> > V;
   for (const auto &E : EC) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
-    std::vector<Chain *> Cs(EC.member_begin(E), EC.member_end());
+    std::vector<Chain *> Cs(EC.member_begin(*E), EC.member_end());
     if (Cs.empty()) continue;
     V.push_back(std::move(Cs));
   }
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
index 4a700bd213ed5..32472201cf9c2 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
@@ -1017,11 +1017,12 @@ void RecursiveSearchSplitting::setupWorkList() {
   }
 
   for (const auto &Node : NodeEC) {
-    if (!Node.isLeader())
+    if (!Node->isLeader())
       continue;
 
     BitVector Cluster = SG.createNodesBitVector();
-    for (auto MI = NodeEC.member_begin(Node); MI != NodeEC.member_end(); ++MI) {
+    for (auto MI = NodeEC.member_begin(*Node); MI != NodeEC.member_end();
+         ++MI) {
       const SplitGraph::Node &N = SG.getNode(*MI);
       if (N.isGraphEntryPoint())
         N.getDependencies(Cluster);
diff --git a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
index 11f9f0271395b..fcd8918f1d9d7 100644
--- a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+++ b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
@@ -2339,36 +2339,17 @@ bool LowerTypeTestsModule::lower() {
   if (GlobalClasses.empty())
     return false;
 
-  // Build a list of disjoint sets ordered by their maximum global index for
-  // determinism.
-  std::vector<std::pair<GlobalClassesTy::iterator, unsigned>> Sets;
-  for (GlobalClassesTy::iterator I = GlobalClasses.begin(),
-                                 E = GlobalClasses.end();
-       I != E; ++I) {
-    if (!I->isLeader())
+  // For each disjoint set we found...
+  for (const auto &C : GlobalClasses) {
+    if (!C->isLeader())
       continue;
-    ++NumTypeIdDisjointSets;
-
-    unsigned MaxUniqueId = 0;
-    for (GlobalClassesTy::member_iterator MI = GlobalClasses.member_begin(*I);
-         MI != GlobalClasses.member_end(); ++MI) {
-      if (auto *MD = dyn_cast_if_present<Metadata *>(*MI))
-        MaxUniqueId = std::max(MaxUniqueId, TypeIdInfo[MD].UniqueId);
-      else if (auto *BF = dyn_cast_if_present<ICallBranchFunnel *>(*MI))
-        MaxUniqueId = std::max(MaxUniqueId, BF->UniqueId);
-    }
-    Sets.emplace_back(I, MaxUniqueId);
-  }
-  llvm::sort(Sets, llvm::less_second());
 
-  // For each disjoint set we found...
-  for (const auto &S : Sets) {
+    ++NumTypeIdDisjointSets;
     // Build the list of type identifiers in this disjoint set.
     std::vector<Metadata *> TypeIds;
     std::vector<GlobalTypeMember *> Globals;
     std::vector<ICallBranchFunnel *> ICallBranchFunnels;
-    for (GlobalClassesTy::member_iterator MI =
-             GlobalClasses.member_begin(*S.first);
+    for (GlobalClassesTy::member_iterator MI = GlobalClasses.member_begin(*C);
          MI != GlobalClasses.member_end(); ++MI) {
       if (isa<Metadata *>(*MI))
         TypeIds.push_back(cast<Metadata *>(*MI));
diff --git a/llvm/lib/Transforms/Scalar/Float2Int.cpp b/llvm/lib/Transforms/Scalar/Float2Int.cpp
index 85c376c564d35..927877b3135e5 100644
--- a/llvm/lib/Transforms/Scalar/Float2Int.cpp
+++ b/llvm/lib/Transforms/Scalar/Float2Int.cpp
@@ -312,14 +312,16 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
 
   // Iterate over every disjoint partition of the def-use graph.
   for (const auto &E : ECs) {
-    if (!E.isLeader())
+    if (!E->isLeader())
       continue;
+
     ConstantRange R(MaxIntegerBW + 1, false);
     bool Fail = false;
     Type *ConvertedToTy = nullptr;
 
     // For every member of the partition, union all the ranges together.
-    for (auto MI = ECs.member_begin(E), ME = ECs.member_end(); MI != ME; ++MI) {
+    for (auto MI = ECs.member_begin(*E), ME = ECs.member_end(); MI != ME;
+         ++MI) {
       Instruction *I = *MI;
       auto SeenI = SeenInsts.find(I);
       if (SeenI == SeenInsts.end())
@@ -349,7 +351,7 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
 
     // If the set was empty, or we failed, or the range is poisonous,
     // bail out.
-    if (ECs.member_begin(E) == ECs.member_end() || Fail || R.isFullSet() ||
+    if (ECs.member_begin(*E) == ECs.member_end() || Fail || R.isFullSet() ||
         R.isSignWrappedSet())
       continue;
     assert(ConvertedToTy && "Must have set the convertedtoty by this point!");
@@ -389,7 +391,7 @@ bool Float2IntPass::validateAndTransform(const DataLayout &DL) {
       }
     }
 
-    for (auto MI = ECs.member_begin(E), ME = ECs.member_end(); MI != ME; ++MI)
+    for (auto MI = ECs.member_begin(*E), ME = ECs.member_end(); MI != ME; ++MI)
       convert(*MI, Ty);
     MadeChange = true;
   }
diff --git a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
index 5f03d854b51e6..e10d0c0defd96 100644
--- a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
@@ -386,11 +386,11 @@ class InstPartitionContainer {
     // Merge the member of an equivalence class into its class leader.  This
     // makes the members empty.
     for (const auto &C : ToBeMerged) {
-      if (!C.isLeader())
+      if (!C->isLeader())
         continue;
 
-      auto PartI = C.getData();
-      for (auto *PartJ : make_range(std::next(ToBeMerged.member_begin(C)),
+      auto PartI = C->getData();
+      for (auto *PartJ : make_range(std::next(ToBeMerged.member_begin(*C)),
                                     ToBeMerged.member_end())) {
         PartJ->moveTo(*PartI);
       }
diff --git a/llvm/lib/Transforms/Utils/SplitModule.cpp b/llvm/lib/Transforms/Utils/SplitModule.cpp
index 507e3c7a42737..c39771733ee0d 100644
--- a/llvm/lib/Transforms/Utils/SplitModule.cpp
+++ b/llvm/lib/Transforms/Utils/SplitModule.cpp
@@ -170,40 +170,24 @@ static void findPartitions(Module &M, ClusterIDMapType &ClusterIDMap,
   for (unsigned i = 0; i < N; ++i)
     BalancingQueue.push(std::make_pair(i, 0));
 
-  using SortType = std::pair<unsigned, ClusterMapType::iterator>;
-
-  SmallVector<SortType, 64> Sets;
   SmallPtrSet<const GlobalValue *, 32> Visited;
 
   // To guarantee determinism, we have to sort SCC according to size.
   // When size is the same, use leader's name.
-  for (ClusterMapType::iterator I = GVtoClusterMap.begin(),
-                                E = GVtoClusterMap.end();
-       I != E; ++I)
-    if (I->isLeader())
-      Sets.push_back(
-          std::make_pair(std::distance(GVtoClusterMap.member_begin(*I),
-                                       GVtoClusterMap.member_end()),
-                         I));
-
-  llvm::sort(Sets, [](const SortType &a, const SortType &b) {
-    if (a.first == b.first)
-      return a.second->getData()->getName() > b.second->getData()->getName();
-    else
-      return a.first > b.first;
-  });
-
-  for (auto &I : Sets) {
+  for (const auto &C : GVtoClusterMap) {
+    if (!C->isLeader())
+      continue;
+
     unsigned CurrentClusterID = BalancingQueue.top().first;
     unsigned CurrentClusterSize = BalancingQueue.top().second;
     BalancingQueue.pop();
 
     LLVM_DEBUG(dbgs() << "Root[" << CurrentClusterID << "] cluster_size("
-                      << I.first << ") ----> " << I.second->getData()->getName()
-                      << "\n");
+                      << std::distance(GVtoClusterMap.member_begin(*C),
+                                       GVtoClusterMap.member_end())
+                      << ") ----> " << C->getData()->getName() << "\n");
 
-    for (ClusterMapType::member_iterator MI =
-             GVtoClusterMap.findLeader(*I.second);
+    for (ClusterMapType::member_iterator MI = GVtoClusterMap.findLeader(*C);
          MI != GVtoClusterMap.member_end(); ++MI) {
       if (!Visited.insert(*MI).second)
         continue;
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
index ed4e0e411c7af..a965dc7ccdcad 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
@@ -50,7 +50,7 @@ def infer_complex_tempreg: GICombineRule <
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_match_1:          [dst, x]
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_apply_0:          [tmp, y]
 // CHECK-NEXT:         Groups for  __infer_variadic_outs_apply_1:
-// CHECK-NEXT: Final Type Equivalence Classes: [tmp, dst, x, y]  [vec]
+// CHECK-NEXT: Final Type Equivalence Classes: [vec] [tmp, dst, x, y]
 // CHECK-NEXT: INFER: MachineOperand $tmp -> GITypeOf<$dst>
 // CHECK-NEXT: Apply patterns for rule infer_variadic_outs after inference:
 // CHECK-NEXT:   (CodeGenInstructionPattern name:__infer_variadic_outs_apply_0 G_FNEG operands:[<def>GITypeOf<$dst>:$tmp, $y])
diff --git a/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll b/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
index 92281e274adf0..a3fdb3cad94ca 100644
--- a/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
+++ b/llvm/test/Transforms/LowerTypeTests/function-disjoint.ll
@@ -5,8 +5,8 @@
 
 target datalayout = "e-p:64:64"
 
-; X64: @f = alias void (), ptr @[[JT0:.*]]
 ; X64: @g = alias void (), ptr @[[JT1:.*]]
+; X64: @f = alias void (), ptr @[[JT0:.*]]
 
 ; WASM32: private constant [0 x i8] zeroinitializer
 @0 = private unnamed_addr constant [2 x ptr] [ptr @f, ptr @g], align 16
@@ -30,20 +30,20 @@ declare i1 @llvm.type.test(ptr %ptr, metadata %bitset) nounwind readnone
 
 define i1 @foo(ptr %p) {
   ; X64: icmp eq i64 {{.*}}, ptrtoint (ptr @[[JT0]] to i64)
-  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 1) to i64)
+  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 2) to i64)
   %x = call i1 @llvm.type.test(ptr %p, metadata !"typeid1")
   ; X64: icmp eq i64 {{.*}}, ptrtoint (ptr @[[JT1]] to i64)
-  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 2) to i64)
+  ; WASM32: icmp eq i64 {{.*}}, ptrtoint (ptr getelementptr (i8, ptr null, i64 1) to i64)
   %y = call i1 @llvm.type.test(ptr %p, metadata !"typeid2")
   %z = add i1 %x, %y
   ret i1 %z
 }
 
-; X64: define private void @[[JT0]]() #{{.*}} align 8 {
-; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @f.cfi)
-
 ; X64: define private void @[[JT1]]() #{{.*}} align 8 {
 ; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @g.cfi)
 
+; X64: define private void @[[JT0]]() #{{.*}} align 8 {
+; X64:   call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr @f.cfi)
+
 ; WASM32: ![[I0]] = !{i64 1}
 ; WASM32: ![[I1]] = !{i64 2}
diff --git a/llvm/test/Transforms/LowerTypeTests/nonstring.ll b/llvm/test/Transforms/LowerTypeTests/nonstring.ll
index ff8cc52d48344..ff7189aa0189c 100644
--- a/llvm/test/Transforms/LowerTypeTests/nonstring.ll
+++ b/llvm/test/Transforms/LowerTypeTests/nonstring.ll
@@ -4,8 +4,8 @@
 
 target datalayout = "e-p:32:32"
 
-; CHECK: @[[ANAME:.*]] = private constant { i32 }
 ; CHECK: @[[BNAME:.*]] = private constant { [2 x i32] }
+; CHECK: @[[ANAME:.*]] = private constant { i32 }
 
 @a = constant i32 1, !type !0
 @b = constant [2 x i32] [i32 2, i32 3], !type !1
diff --git a/llvm/test/tools/llvm-split/preserve-locals.ll b/llvm/test/tools/llvm-split/preserve-locals.ll
index d128daaf35dd7..ff0610bd65499 100644
--- a/llvm/test/tools/llvm-split/preserve-locals.ll
+++ b/llvm/test/tools/llvm-split/preserve-locals.ll
@@ -2,14 +2,15 @@
 ; RUN: llvm-dis -o - %t0 | FileCheck --check-prefix=CHECK0 %s
 ; RUN: llvm-dis -o - %t1 | FileCheck --check-prefix=CHECK1 %s
 
-; The local_var and local_func must not be separated.
-; CHECK0: @local_var
-; CHECK0: define internal fastcc void @local_func
 ; The main and a must not be separated.
 ; The main and local_func must not be together.
-; CHECK1: @a
-; CHECK1: define i32 @main
-; CHECK1: declare dso_local fastcc void @local_func
+; CHECK0: @a
+; CHECK0: define i32 @main
+; CHECK0: declare dso_local fastcc void @local_func
+
+; The local_var and local_func must not be separated.
+; CHECK1: @local_var
+; CHECK1: define internal fastcc void @local_func
 
 @a = internal global i32 0, align 4
 @global_storage = common global i32 0, align 4
diff --git a/llvm/test/tools/llvm-split/scc-const-alias.ll b/llvm/test/tools/llvm-split/scc-const-alias.ll
index 20670af416c44..9e66f38f50843 100644
--- a/llvm/test/tools/llvm-split/scc-const-alias.ll
+++ b/llvm/test/tools/llvm-split/scc-const-alias.ll
@@ -5,12 +5,12 @@
 ; RUN: llvm-dis -o - %t2 | FileCheck --check-prefix=CHECK2 %s
 
 ; Checks are not critical here - verifier will assert if we fail.
-; CHECK0: @g1 = global i32 99
-; CHECK0: @c1Alias = external global i8
-; CHECK0: @g1Alias = internal alias i8, ptr @g1
+; CHECK0: @g1 = external global i32
+; CHECK0: @c1Alias = internal alias i8, inttoptr (i64 42 to ptr)
 
-; CHECK1: @g1 = external global i32
-; CHECK1: @c1Alias = internal alias i8, inttoptr (i64 42 to ptr)
+; CHECK1: @g1 = global i32 99
+; CHECK1: @c1Alias = external global i8
+; CHECK1: @g1Alias = internal alias i8, ptr @g1
 
 ; Third file is actually empty.
 ; CHECK2: @g1 = external global i32
diff --git a/llvm/test/tools/llvm-split/scc-global-alias.ll b/llvm/test/tools/llvm-split/scc-global-alias.ll
index ee3b6a1c1ce1a..b3b52ccd535a0 100644
--- a/llvm/test/tools/llvm-split/scc-global-alias.ll
+++ b/llvm/test/tools/llvm-split/scc-global-alias.ll
@@ -5,16 +5,16 @@
 ; RUN: llvm-dis -o - %t2 | FileCheck --check-prefix=CHECK2 %s
 
 ; Checks are not critical here - verifier will assert if we fail.
-; CHECK0: @funInternal2Alias = alias
-; CHECK0: @funExternal2Alias = alias
-; CHECK0: define internal i32 @funInternal2
-; CHECK0: define i32 @funExternal2
+; CHECK0: @funInternalAlias = alias
+; CHECK0: define internal i32 @funInternal
 
-; CHECK1: @funInternalAlias = alias
-; CHECK1: define internal i32 @funInternal
+; CHECK1: @funExternalAlias = alias
+; CHECK1: define...
[truncated]

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. We could keep the existing API with a custom iterator, but given the limited amount of uses this seems ok.

Comment on lines +142 to +143
/// List of all members, used to provide a determinstic iteration order.
SmallVector<const ECValue *> Members;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you combine into SetVector?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where things get a little tricky. ECValue contain pointers to other ECValue's (the leader of the class and the next member of the class), so using something like SetVector as the backing storage won't work I think, as it might reallocate and move the objects.

One thing I tried is using a BumpAllocator for the elements in the class and a separate DenseMap instead of the std::set: 5c13864

I think SetVector (or MapVector) would be slightly worse here, as it stores an ECValue in both a vector and a denseSet, and ECValue contains 3 pointers, so it would require quite a bit of extra memory .

The patch could use a DenseSet, it just requires a custom DenseMapInfo for ECValue to just hash/compare the data part. I can do that if desired and prepare the patch for review

Compile-time wise this looks neutral, but it would simplify things slightly https://llvm-compile-time-tracker.com/compare.php?from=b0c2ac67a88d3ef86987e2f82115ea0170675a17&to=5c13864badfd7170c9c7c0a908e3cba8aa0f4fea&stat=instructions:u

(the patch just has the unit tests for EQClass removed, because one of the keys doesn't have DenseMapInfo).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shared it here #134264, not sure if it is a huge improvement though

fhahn added 2 commits April 2, 2025 16:00
Currently iterators over EquivalenceClasses will iterate over std::set,
which guarantees the order specified by the comperator. Unfortunately in
many cases, EquivalenceClasses are used with pointers, so iterating over
std::set of pointers will not be deterministic across runs.

There are multiple places that explicitly try to sort the equivalence
classes before using them to try to get a deterministic order
(LowerTypeTests, SplitModule), but there are others that do not at the
moment and this can result at least in non-determinstic value naming in
Float2Int.

This patch updates EquivalenceClasses to keep track of all members via a
extra SmallVector and removes code from LowerTypeTests and SplitModule
to sort the classes before processing.

Overall it looks like compile-time slightly decreases in most cases, but
close to noise:
https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions
@fhahn fhahn force-pushed the eq-classes-deterministic-iter branch from ab1993c to 3e50db9 Compare April 2, 2025 15:15
@fhahn fhahn merged commit 3bdf9a0 into llvm:main Apr 2, 2025
12 checks passed
@fhahn fhahn deleted the eq-classes-deterministic-iter branch April 2, 2025 19:27
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Apr 2, 2025
…ration order. (#134075)

Currently iterators over EquivalenceClasses will iterate over std::set,
which guarantees the order specified by the comperator. Unfortunately in
many cases, EquivalenceClasses are used with pointers, so iterating over
std::set of pointers will not be deterministic across runs.

There are multiple places that explicitly try to sort the equivalence
classes before using them to try to get a deterministic order
(LowerTypeTests, SplitModule), but there are others that do not at the
moment and this can result at least in non-determinstic value naming in
Float2Int.

This patch updates EquivalenceClasses to keep track of all members via a
extra SmallVector and removes code from LowerTypeTests and SplitModule
to sort the classes before processing.

Overall it looks like compile-time slightly decreases in most cases, but
close to noise:

https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions

PR: llvm/llvm-project#134075
fhahn added a commit that referenced this pull request Apr 5, 2025
Replace the std::set with DenseMap, which removes the requirement for an
ordering predicate. This also requires to allocate the ECValue objects
separately. This patch uses a BumpPtrAllocator.

Follow-up to #134075.

Compile-time impact is mostly neutral or slightly positive:

https://llvm-compile-time-tracker.com/compare.php?from=ee4e8197fa67dd1ed6e9470e00708e7feeaacd97&to=242e6a8e42889eebfc0bb5d433a4de7dd9e224a7&stat=instructions:u
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Apr 5, 2025
…C) (#134264)

Replace the std::set with DenseMap, which removes the requirement for an
ordering predicate. This also requires to allocate the ECValue objects
separately. This patch uses a BumpPtrAllocator.

Follow-up to llvm/llvm-project#134075.

Compile-time impact is mostly neutral or slightly positive:

https://llvm-compile-time-tracker.com/compare.php?from=ee4e8197fa67dd1ed6e9470e00708e7feeaacd97&to=242e6a8e42889eebfc0bb5d433a4de7dd9e224a7&stat=instructions:u
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 backend:AMDGPU backend:WebAssembly clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category llvm:adt llvm:analysis Includes value tracking, cost tables and constant folding llvm:globalisel llvm:transforms tablegen
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants