Skip to content

BalancedPartitioning: minor updates #77568

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 17, 2024
Merged

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Jan 10, 2024

When LargestTraceSize is a power of two, createBPFunctionNodes does not
allocate a group ID for Trace[LargestTraceSize-1] (as N is off by 1). Fix
this and change floor+log2 to Log2_64.

BalancedPartitioning::bisect can use unstable sort because Nodes
contains distinct InputOrderIndexs.

BalancedPartitioning::runIterations: use one DenseMap and simplify the
node renumbering code.

@MaskRay MaskRay requested a review from ellishg January 10, 2024 09:06
@llvmbot llvmbot added PGO Profile Guided Optimizations llvm:support labels Jan 10, 2024
@llvmbot
Copy link
Member

llvmbot commented Jan 10, 2024

@llvm/pr-subscribers-pgo

Author: Fangrui Song (MaskRay)

Changes

When LargestTraceSize is a power of two, createBPFunctionNodes does not
allocate a group ID for Trace[LargestTraceSize-1] (as N is off by 1). Fix
this and change floor+log2 to Log2_64.

BalancedPartitioning::bisect can use unstable sort because Nodes
contains distinct InputOrderIndexs.

BalancedPartitioning::runIterations: use one DenseMap and simplify the
node renumbering code.


Full diff: https://github.com/llvm/llvm-project/pull/77568.diff

3 Files Affected:

  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+5-4)
  • (modified) llvm/lib/Support/BalancedPartitioning.cpp (+6-10)
  • (modified) llvm/unittests/ProfileData/BPFunctionNodeTest.cpp (+12-5)
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 4264da8ad75143..6016fcb3551d11 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -13,6 +13,7 @@
 
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/ADT/ArrayRef.h"
+#include "llvm/Support/MathExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
@@ -921,7 +922,7 @@ std::vector<BPFunctionNode> TemporalProfTraceTy::createBPFunctionNodes(
       if (Timestamp < Trace.FunctionNameRefs.size())
         FunctionIds.insert(Trace.FunctionNameRefs[Timestamp]);
 
-  int N = std::ceil(std::log2(LargestTraceSize));
+  const int N = Log2_64(LargestTraceSize) + 1;
 
   // TODO: We need to use the Trace.Weight field to give more weight to more
   // important utilities
@@ -929,8 +930,8 @@ std::vector<BPFunctionNode> TemporalProfTraceTy::createBPFunctionNodes(
   for (size_t TraceIdx = 0; TraceIdx < Traces.size(); TraceIdx++) {
     auto &Trace = Traces[TraceIdx].FunctionNameRefs;
     for (size_t Timestamp = 0; Timestamp < Trace.size(); Timestamp++) {
-      for (int I = std::floor(std::log2(Timestamp + 1)); I < N; I++) {
-        auto &FunctionId = Trace[Timestamp];
+      for (int I = Log2_64(Timestamp + 1); I < N; I++) {
+        auto FunctionId = Trace[Timestamp];
         UtilityNodeT GroupId = TraceIdx * N + I;
         FuncGroups[FunctionId].push_back(GroupId);
       }
@@ -938,7 +939,7 @@ std::vector<BPFunctionNode> TemporalProfTraceTy::createBPFunctionNodes(
   }
 
   std::vector<BPFunctionNode> Nodes;
-  for (auto &Id : FunctionIds) {
+  for (auto Id : FunctionIds) {
     auto &UNs = FuncGroups[Id];
     llvm::sort(UNs);
     UNs.erase(std::unique(UNs.begin(), UNs.end()), UNs.end());
diff --git a/llvm/lib/Support/BalancedPartitioning.cpp b/llvm/lib/Support/BalancedPartitioning.cpp
index 5843be94991151..cb6ba61179941f 100644
--- a/llvm/lib/Support/BalancedPartitioning.cpp
+++ b/llvm/lib/Support/BalancedPartitioning.cpp
@@ -117,7 +117,7 @@ void BalancedPartitioning::bisect(const FunctionNodeRange Nodes,
   if (NumNodes <= 1 || RecDepth >= Config.SplitDepth) {
     // We've reach the lowest level of the recursion tree. Fall back to the
     // original order and assign to buckets.
-    llvm::stable_sort(Nodes, [](const auto &L, const auto &R) {
+    llvm::sort(Nodes, [](const auto &L, const auto &R) {
       return L.InputOrderIndex < R.InputOrderIndex;
     });
     for (auto &N : Nodes)
@@ -167,26 +167,22 @@ void BalancedPartitioning::runIterations(const FunctionNodeRange Nodes,
                                          unsigned RightBucket,
                                          std::mt19937 &RNG) const {
   unsigned NumNodes = std::distance(Nodes.begin(), Nodes.end());
-  DenseMap<BPFunctionNode::UtilityNodeT, unsigned> UtilityNodeDegree;
+  DenseMap<BPFunctionNode::UtilityNodeT, unsigned> UtilityNodeIndex;
   for (auto &N : Nodes)
     for (auto &UN : N.UtilityNodes)
-      ++UtilityNodeDegree[UN];
+      ++UtilityNodeIndex[UN];
   // Remove utility nodes if they have just one edge or are connected to all
   // functions
   for (auto &N : Nodes)
     llvm::erase_if(N.UtilityNodes, [&](auto &UN) {
-      return UtilityNodeDegree[UN] <= 1 || UtilityNodeDegree[UN] >= NumNodes;
+      return UtilityNodeIndex[UN] == 1 || UtilityNodeIndex[UN] == NumNodes;
     });
 
   // Renumber utility nodes so they can be used to index into Signatures
-  DenseMap<BPFunctionNode::UtilityNodeT, unsigned> UtilityNodeIndex;
-  for (auto &N : Nodes)
-    for (auto &UN : N.UtilityNodes)
-      if (!UtilityNodeIndex.count(UN))
-        UtilityNodeIndex[UN] = UtilityNodeIndex.size();
+  UtilityNodeIndex.clear();
   for (auto &N : Nodes)
     for (auto &UN : N.UtilityNodes)
-      UN = UtilityNodeIndex[UN];
+      UN = UtilityNodeIndex.insert({UN, UtilityNodeIndex.size()}).first->second;
 
   // Initialize signatures
   SignaturesT Signatures(/*Size=*/UtilityNodeIndex.size());
diff --git a/llvm/unittests/ProfileData/BPFunctionNodeTest.cpp b/llvm/unittests/ProfileData/BPFunctionNodeTest.cpp
index 97ff5c1edf5efc..6af6f1bcdc40a8 100644
--- a/llvm/unittests/ProfileData/BPFunctionNodeTest.cpp
+++ b/llvm/unittests/ProfileData/BPFunctionNodeTest.cpp
@@ -24,11 +24,6 @@ void PrintTo(const BPFunctionNode &Node, std::ostream *OS) {
 }
 
 TEST(BPFunctionNodeTest, Basic) {
-  auto Nodes = TemporalProfTraceTy::createBPFunctionNodes({
-      TemporalProfTraceTy({0, 1, 2, 3, 4}),
-      TemporalProfTraceTy({4, 2}),
-  });
-
   auto NodeIs = [](BPFunctionNode::IDT Id,
                    ArrayRef<BPFunctionNode::UtilityNodeT> UNs) {
     return AllOf(Field("Id", &BPFunctionNode::Id, Id),
@@ -36,6 +31,18 @@ TEST(BPFunctionNodeTest, Basic) {
                        UnorderedElementsAreArray(UNs)));
   };
 
+  auto Nodes = TemporalProfTraceTy::createBPFunctionNodes({
+      TemporalProfTraceTy({0, 1, 2, 3}),
+  });
+  EXPECT_THAT(Nodes,
+              UnorderedElementsAre(NodeIs(0, {0, 1, 2}), NodeIs(1, {1, 2}),
+                                   NodeIs(2, {1, 2}), NodeIs(3, {2})));
+
+  Nodes = TemporalProfTraceTy::createBPFunctionNodes({
+      TemporalProfTraceTy({0, 1, 2, 3, 4}),
+      TemporalProfTraceTy({4, 2}),
+  });
+
   EXPECT_THAT(Nodes,
               UnorderedElementsAre(NodeIs(0, {0, 1, 2}), NodeIs(1, {1, 2}),
                                    NodeIs(2, {1, 2, 4, 5}), NodeIs(3, {2}),

@llvmbot
Copy link
Member

llvmbot commented Jan 10, 2024

@llvm/pr-subscribers-llvm-support

Author: Fangrui Song (MaskRay)

Changes

When LargestTraceSize is a power of two, createBPFunctionNodes does not
allocate a group ID for Trace[LargestTraceSize-1] (as N is off by 1). Fix
this and change floor+log2 to Log2_64.

BalancedPartitioning::bisect can use unstable sort because Nodes
contains distinct InputOrderIndexs.

BalancedPartitioning::runIterations: use one DenseMap and simplify the
node renumbering code.


Full diff: https://github.com/llvm/llvm-project/pull/77568.diff

3 Files Affected:

  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+5-4)
  • (modified) llvm/lib/Support/BalancedPartitioning.cpp (+6-10)
  • (modified) llvm/unittests/ProfileData/BPFunctionNodeTest.cpp (+12-5)
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 4264da8ad75143..6016fcb3551d11 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -13,6 +13,7 @@
 
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/ADT/ArrayRef.h"
+#include "llvm/Support/MathExtras.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
@@ -921,7 +922,7 @@ std::vector<BPFunctionNode> TemporalProfTraceTy::createBPFunctionNodes(
       if (Timestamp < Trace.FunctionNameRefs.size())
         FunctionIds.insert(Trace.FunctionNameRefs[Timestamp]);
 
-  int N = std::ceil(std::log2(LargestTraceSize));
+  const int N = Log2_64(LargestTraceSize) + 1;
 
   // TODO: We need to use the Trace.Weight field to give more weight to more
   // important utilities
@@ -929,8 +930,8 @@ std::vector<BPFunctionNode> TemporalProfTraceTy::createBPFunctionNodes(
   for (size_t TraceIdx = 0; TraceIdx < Traces.size(); TraceIdx++) {
     auto &Trace = Traces[TraceIdx].FunctionNameRefs;
     for (size_t Timestamp = 0; Timestamp < Trace.size(); Timestamp++) {
-      for (int I = std::floor(std::log2(Timestamp + 1)); I < N; I++) {
-        auto &FunctionId = Trace[Timestamp];
+      for (int I = Log2_64(Timestamp + 1); I < N; I++) {
+        auto FunctionId = Trace[Timestamp];
         UtilityNodeT GroupId = TraceIdx * N + I;
         FuncGroups[FunctionId].push_back(GroupId);
       }
@@ -938,7 +939,7 @@ std::vector<BPFunctionNode> TemporalProfTraceTy::createBPFunctionNodes(
   }
 
   std::vector<BPFunctionNode> Nodes;
-  for (auto &Id : FunctionIds) {
+  for (auto Id : FunctionIds) {
     auto &UNs = FuncGroups[Id];
     llvm::sort(UNs);
     UNs.erase(std::unique(UNs.begin(), UNs.end()), UNs.end());
diff --git a/llvm/lib/Support/BalancedPartitioning.cpp b/llvm/lib/Support/BalancedPartitioning.cpp
index 5843be94991151..cb6ba61179941f 100644
--- a/llvm/lib/Support/BalancedPartitioning.cpp
+++ b/llvm/lib/Support/BalancedPartitioning.cpp
@@ -117,7 +117,7 @@ void BalancedPartitioning::bisect(const FunctionNodeRange Nodes,
   if (NumNodes <= 1 || RecDepth >= Config.SplitDepth) {
     // We've reach the lowest level of the recursion tree. Fall back to the
     // original order and assign to buckets.
-    llvm::stable_sort(Nodes, [](const auto &L, const auto &R) {
+    llvm::sort(Nodes, [](const auto &L, const auto &R) {
       return L.InputOrderIndex < R.InputOrderIndex;
     });
     for (auto &N : Nodes)
@@ -167,26 +167,22 @@ void BalancedPartitioning::runIterations(const FunctionNodeRange Nodes,
                                          unsigned RightBucket,
                                          std::mt19937 &RNG) const {
   unsigned NumNodes = std::distance(Nodes.begin(), Nodes.end());
-  DenseMap<BPFunctionNode::UtilityNodeT, unsigned> UtilityNodeDegree;
+  DenseMap<BPFunctionNode::UtilityNodeT, unsigned> UtilityNodeIndex;
   for (auto &N : Nodes)
     for (auto &UN : N.UtilityNodes)
-      ++UtilityNodeDegree[UN];
+      ++UtilityNodeIndex[UN];
   // Remove utility nodes if they have just one edge or are connected to all
   // functions
   for (auto &N : Nodes)
     llvm::erase_if(N.UtilityNodes, [&](auto &UN) {
-      return UtilityNodeDegree[UN] <= 1 || UtilityNodeDegree[UN] >= NumNodes;
+      return UtilityNodeIndex[UN] == 1 || UtilityNodeIndex[UN] == NumNodes;
     });
 
   // Renumber utility nodes so they can be used to index into Signatures
-  DenseMap<BPFunctionNode::UtilityNodeT, unsigned> UtilityNodeIndex;
-  for (auto &N : Nodes)
-    for (auto &UN : N.UtilityNodes)
-      if (!UtilityNodeIndex.count(UN))
-        UtilityNodeIndex[UN] = UtilityNodeIndex.size();
+  UtilityNodeIndex.clear();
   for (auto &N : Nodes)
     for (auto &UN : N.UtilityNodes)
-      UN = UtilityNodeIndex[UN];
+      UN = UtilityNodeIndex.insert({UN, UtilityNodeIndex.size()}).first->second;
 
   // Initialize signatures
   SignaturesT Signatures(/*Size=*/UtilityNodeIndex.size());
diff --git a/llvm/unittests/ProfileData/BPFunctionNodeTest.cpp b/llvm/unittests/ProfileData/BPFunctionNodeTest.cpp
index 97ff5c1edf5efc..6af6f1bcdc40a8 100644
--- a/llvm/unittests/ProfileData/BPFunctionNodeTest.cpp
+++ b/llvm/unittests/ProfileData/BPFunctionNodeTest.cpp
@@ -24,11 +24,6 @@ void PrintTo(const BPFunctionNode &Node, std::ostream *OS) {
 }
 
 TEST(BPFunctionNodeTest, Basic) {
-  auto Nodes = TemporalProfTraceTy::createBPFunctionNodes({
-      TemporalProfTraceTy({0, 1, 2, 3, 4}),
-      TemporalProfTraceTy({4, 2}),
-  });
-
   auto NodeIs = [](BPFunctionNode::IDT Id,
                    ArrayRef<BPFunctionNode::UtilityNodeT> UNs) {
     return AllOf(Field("Id", &BPFunctionNode::Id, Id),
@@ -36,6 +31,18 @@ TEST(BPFunctionNodeTest, Basic) {
                        UnorderedElementsAreArray(UNs)));
   };
 
+  auto Nodes = TemporalProfTraceTy::createBPFunctionNodes({
+      TemporalProfTraceTy({0, 1, 2, 3}),
+  });
+  EXPECT_THAT(Nodes,
+              UnorderedElementsAre(NodeIs(0, {0, 1, 2}), NodeIs(1, {1, 2}),
+                                   NodeIs(2, {1, 2}), NodeIs(3, {2})));
+
+  Nodes = TemporalProfTraceTy::createBPFunctionNodes({
+      TemporalProfTraceTy({0, 1, 2, 3, 4}),
+      TemporalProfTraceTy({4, 2}),
+  });
+
   EXPECT_THAT(Nodes,
               UnorderedElementsAre(NodeIs(0, {0, 1, 2}), NodeIs(1, {1, 2}),
                                    NodeIs(2, {1, 2, 4, 5}), NodeIs(3, {2}),

Copy link

github-actions bot commented Jan 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@ellishg ellishg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for adding another test

When LargestTraceSize is a power of two, createBPFunctionNodes does not
allocate a group ID for Trace[LargestTraceSize-1] (as N is off by 1). Fix
this and change floor+log2 to Log2_64.

BalancedPartitioning::bisect can use unstable sort because `Nodes`
contains distinct `InputOrderIndex`s.

BalancedPartitioning::runIterations: use one DenseMap and simplify the
node renumbering code.
@MaskRay MaskRay force-pushed the createBPFunctionNodes branch from 56a21a6 to 517b26f Compare January 17, 2024 18:43
@MaskRay MaskRay merged commit 0c6dc80 into llvm:main Jan 17, 2024
@MaskRay MaskRay deleted the createBPFunctionNodes branch January 17, 2024 18:46
ampandey-1995 pushed a commit to ampandey-1995/llvm-project that referenced this pull request Jan 19, 2024
When LargestTraceSize is a power of two, createBPFunctionNodes does not
allocate a group ID for Trace[LargestTraceSize-1] (as N is off by 1).
Fix
this and change floor+log2 to Log2_64.

BalancedPartitioning::bisect can use unstable sort because `Nodes`
contains distinct `InputOrderIndex`s.

BalancedPartitioning::runIterations: use one DenseMap and simplify the
node renumbering code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:support PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants