Skip to content

[InstrRef][NFC] Avoid un-necessary DenseMap queries #99048

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 17, 2024

Conversation

jmorse
Copy link
Member

@jmorse jmorse commented Jul 16, 2024

This patch adjusts how some data is stored to avoid a number of un-necessary DenseMap queries. There's no change to the compiler behaviour, and it's measurably faster on the compile time tracker.

The BlockOrders vector in buildVLocValueMap collects the blocks over which a variables value have to be determined: however the Cmp ordering function makes two DenseMap queries to determine the RPO-order of blocks being compared. And given that sorting is O(N log(N)) comparisons this isn't fast. So instead, fetch the RPO-numbers of the block collection, order those, and then map back to the blocks themselves.

The OrderToBB collection mapped an RPO-number to an MBB: it's completely un-necessary to have DenseMap here, we can just use the RPO number as an array index. Switch it to a SmallVector and deal with a few consequences when iterating.

(And for good measure I've jammed in a couple of reserve calls).

This patch adjusts how some data is stored to avoid a number of
un-necessary DenseMap queries. There's no change to the compiler behaviour,
and it's measurably faster on the compile time tracker.

The BlockOrders vector in buildVLocValueMap collects the blocks over which
a variables value have to be determined: however the Cmp ordering function
makes two DenseMap queries to determine the RPO-order of blocks being
compared. And given that sorting is O(N log(N)) comparisons this isn't
fast. So instead, fetch the RPO-numbers of the block collection, order
those, and then map back to the blocks themselves.

The OrderToBB collection mapped an RPO-number to an MBB: it's completely
un-necessary to have DenseMap here, we can just use the RPO number as an
array index. Switch it to a SmallVector and deal with a few consequences
when iterating.

(And for good measure I've jammed in a couple of reserve calls).
@jmorse jmorse requested review from SLTozer and OCHyams July 16, 2024 14:06
@llvmbot
Copy link
Member

llvmbot commented Jul 16, 2024

@llvm/pr-subscribers-debuginfo

Author: Jeremy Morse (jmorse)

Changes

This patch adjusts how some data is stored to avoid a number of un-necessary DenseMap queries. There's no change to the compiler behaviour, and it's measurably faster on the compile time tracker.

The BlockOrders vector in buildVLocValueMap collects the blocks over which a variables value have to be determined: however the Cmp ordering function makes two DenseMap queries to determine the RPO-order of blocks being compared. And given that sorting is O(N log(N)) comparisons this isn't fast. So instead, fetch the RPO-numbers of the block collection, order those, and then map back to the blocks themselves.

The OrderToBB collection mapped an RPO-number to an MBB: it's completely un-necessary to have DenseMap here, we can just use the RPO number as an array index. Switch it to a SmallVector and deal with a few consequences when iterating.

(And for good measure I've jammed in a couple of reserve calls).


Full diff: https://github.com/llvm/llvm-project/pull/99048.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp (+26-18)
  • (modified) llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h (+1-1)
diff --git a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
index 555cbb7a507f4..bde8cc4a89715 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
+++ b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
@@ -3109,12 +3109,8 @@ void InstrRefBasedLDV::buildVLocValueMap(
   SmallPtrSet<const MachineBasicBlock *, 8> BlocksToExplore;
 
   // The order in which to examine them (RPO).
-  SmallVector<MachineBasicBlock *, 8> BlockOrders;
-
-  // RPO ordering function.
-  auto Cmp = [&](MachineBasicBlock *A, MachineBasicBlock *B) {
-    return BBToOrder[A] < BBToOrder[B];
-  };
+  SmallVector<MachineBasicBlock *, 16> BlockOrders;
+  SmallVector<unsigned, 32> BlockOrderNums;
 
   getBlocksForScope(DILoc, BlocksToExplore, AssignBlocks);
 
@@ -3132,11 +3128,16 @@ void InstrRefBasedLDV::buildVLocValueMap(
   for (const auto *MBB : BlocksToExplore)
     MutBlocksToExplore.insert(const_cast<MachineBasicBlock *>(MBB));
 
-  // Picks out relevants blocks RPO order and sort them.
+  // Picks out relevants blocks RPO order and sort them. Sort their
+  // order-numbers and map back to MBB pointers later, to avoid repeated
+  // DenseMap queries during comparisons.
   for (const auto *MBB : BlocksToExplore)
-    BlockOrders.push_back(const_cast<MachineBasicBlock *>(MBB));
+    BlockOrderNums.push_back(BBToOrder[MBB]);
 
-  llvm::sort(BlockOrders, Cmp);
+  llvm::sort(BlockOrderNums);
+  for (unsigned int I : BlockOrderNums)
+    BlockOrders.push_back(OrderToBB[I]);
+  BlockOrderNums.clear();
   unsigned NumBlocks = BlockOrders.size();
 
   // Allocate some vectors for storing the live ins and live outs. Large.
@@ -3396,16 +3397,24 @@ void InstrRefBasedLDV::initialSetup(MachineFunction &MF) {
       return DL.getLine() != 0;
     return false;
   };
-  // Collect a set of all the artificial blocks.
-  for (auto &MBB : MF)
+
+  // Collect a set of all the artificial blocks. Collect the size too, ilist
+  // size calls are O(n).
+  unsigned int Size = 0;
+  for (auto &MBB : MF) {
+    ++Size;
     if (none_of(MBB.instrs(), hasNonArtificialLocation))
       ArtificialBlocks.insert(&MBB);
+  }
 
   // Compute mappings of block <=> RPO order.
   ReversePostOrderTraversal<MachineFunction *> RPOT(&MF);
   unsigned int RPONumber = 0;
+  OrderToBB.reserve(Size);
+  BBToOrder.reserve(Size);
+  BBNumToRPO.reserve(Size);
   auto processMBB = [&](MachineBasicBlock *MBB) {
-    OrderToBB[RPONumber] = MBB;
+    OrderToBB.push_back(MBB);
     BBToOrder[MBB] = RPONumber;
     BBNumToRPO[MBB->getNumber()] = RPONumber;
     ++RPONumber;
@@ -3724,14 +3733,13 @@ bool InstrRefBasedLDV::ExtendRanges(MachineFunction &MF,
 
   // Walk back through each block / instruction, collecting DBG_VALUE
   // instructions and recording what machine value their operands refer to.
-  for (auto &OrderPair : OrderToBB) {
-    MachineBasicBlock &MBB = *OrderPair.second;
-    CurBB = MBB.getNumber();
+  for (MachineBasicBlock *MBB : OrderToBB) {
+    CurBB = MBB->getNumber();
     VTracker = &vlocs[CurBB];
-    VTracker->MBB = &MBB;
-    MTracker->loadFromArray(MInLocs[MBB], CurBB);
+    VTracker->MBB = MBB;
+    MTracker->loadFromArray(MInLocs[*MBB], CurBB);
     CurInst = 1;
-    for (auto &MI : MBB) {
+    for (auto &MI : *MBB) {
       process(MI, &MOutLocs, &MInLocs);
       ++CurInst;
     }
diff --git a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
index 6d77a6972f09b..8770983481c2f 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
+++ b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
@@ -1153,7 +1153,7 @@ class InstrRefBasedLDV : public LDVImpl {
   SmallPtrSet<MachineBasicBlock *, 16> ArtificialBlocks;
 
   // Mapping of blocks to and from their RPOT order.
-  DenseMap<unsigned int, MachineBasicBlock *> OrderToBB;
+  SmallVector<MachineBasicBlock *> OrderToBB;
   DenseMap<const MachineBasicBlock *, unsigned int> BBToOrder;
   DenseMap<unsigned, unsigned> BBNumToRPO;
 

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jmorse jmorse merged commit 0b71d80 into llvm:main Jul 17, 2024
9 checks passed
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
This patch adjusts how some data is stored to avoid a number of
un-necessary DenseMap queries. There's no change to the compiler
behaviour, and it's measurably faster on the compile time tracker.

The BlockOrders vector in buildVLocValueMap collects the blocks over
which a variables value have to be determined: however the Cmp ordering
function makes two DenseMap queries to determine the RPO-order of blocks
being compared. And given that sorting is O(N log(N)) comparisons this
isn't fast. So instead, fetch the RPO-numbers of the block collection,
order those, and then map back to the blocks themselves.

The OrderToBB collection mapped an RPO-number to an MBB: it's completely
un-necessary to have DenseMap here, we can just use the RPO number as an
array index. Switch it to a SmallVector and deal with a few consequences
when iterating.

(And for good measure I've jammed in a couple of reserve calls).

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants