-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[InstrRef][NFC] Avoid un-necessary DenseMap queries #99048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch adjusts how some data is stored to avoid a number of un-necessary DenseMap queries. There's no change to the compiler behaviour, and it's measurably faster on the compile time tracker. The BlockOrders vector in buildVLocValueMap collects the blocks over which a variables value have to be determined: however the Cmp ordering function makes two DenseMap queries to determine the RPO-order of blocks being compared. And given that sorting is O(N log(N)) comparisons this isn't fast. So instead, fetch the RPO-numbers of the block collection, order those, and then map back to the blocks themselves. The OrderToBB collection mapped an RPO-number to an MBB: it's completely un-necessary to have DenseMap here, we can just use the RPO number as an array index. Switch it to a SmallVector and deal with a few consequences when iterating. (And for good measure I've jammed in a couple of reserve calls).
@llvm/pr-subscribers-debuginfo Author: Jeremy Morse (jmorse) ChangesThis patch adjusts how some data is stored to avoid a number of un-necessary DenseMap queries. There's no change to the compiler behaviour, and it's measurably faster on the compile time tracker. The BlockOrders vector in buildVLocValueMap collects the blocks over which a variables value have to be determined: however the Cmp ordering function makes two DenseMap queries to determine the RPO-order of blocks being compared. And given that sorting is O(N log(N)) comparisons this isn't fast. So instead, fetch the RPO-numbers of the block collection, order those, and then map back to the blocks themselves. The OrderToBB collection mapped an RPO-number to an MBB: it's completely un-necessary to have DenseMap here, we can just use the RPO number as an array index. Switch it to a SmallVector and deal with a few consequences when iterating. (And for good measure I've jammed in a couple of reserve calls). Full diff: https://github.com/llvm/llvm-project/pull/99048.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
index 555cbb7a507f4..bde8cc4a89715 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
+++ b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
@@ -3109,12 +3109,8 @@ void InstrRefBasedLDV::buildVLocValueMap(
SmallPtrSet<const MachineBasicBlock *, 8> BlocksToExplore;
// The order in which to examine them (RPO).
- SmallVector<MachineBasicBlock *, 8> BlockOrders;
-
- // RPO ordering function.
- auto Cmp = [&](MachineBasicBlock *A, MachineBasicBlock *B) {
- return BBToOrder[A] < BBToOrder[B];
- };
+ SmallVector<MachineBasicBlock *, 16> BlockOrders;
+ SmallVector<unsigned, 32> BlockOrderNums;
getBlocksForScope(DILoc, BlocksToExplore, AssignBlocks);
@@ -3132,11 +3128,16 @@ void InstrRefBasedLDV::buildVLocValueMap(
for (const auto *MBB : BlocksToExplore)
MutBlocksToExplore.insert(const_cast<MachineBasicBlock *>(MBB));
- // Picks out relevants blocks RPO order and sort them.
+ // Picks out relevants blocks RPO order and sort them. Sort their
+ // order-numbers and map back to MBB pointers later, to avoid repeated
+ // DenseMap queries during comparisons.
for (const auto *MBB : BlocksToExplore)
- BlockOrders.push_back(const_cast<MachineBasicBlock *>(MBB));
+ BlockOrderNums.push_back(BBToOrder[MBB]);
- llvm::sort(BlockOrders, Cmp);
+ llvm::sort(BlockOrderNums);
+ for (unsigned int I : BlockOrderNums)
+ BlockOrders.push_back(OrderToBB[I]);
+ BlockOrderNums.clear();
unsigned NumBlocks = BlockOrders.size();
// Allocate some vectors for storing the live ins and live outs. Large.
@@ -3396,16 +3397,24 @@ void InstrRefBasedLDV::initialSetup(MachineFunction &MF) {
return DL.getLine() != 0;
return false;
};
- // Collect a set of all the artificial blocks.
- for (auto &MBB : MF)
+
+ // Collect a set of all the artificial blocks. Collect the size too, ilist
+ // size calls are O(n).
+ unsigned int Size = 0;
+ for (auto &MBB : MF) {
+ ++Size;
if (none_of(MBB.instrs(), hasNonArtificialLocation))
ArtificialBlocks.insert(&MBB);
+ }
// Compute mappings of block <=> RPO order.
ReversePostOrderTraversal<MachineFunction *> RPOT(&MF);
unsigned int RPONumber = 0;
+ OrderToBB.reserve(Size);
+ BBToOrder.reserve(Size);
+ BBNumToRPO.reserve(Size);
auto processMBB = [&](MachineBasicBlock *MBB) {
- OrderToBB[RPONumber] = MBB;
+ OrderToBB.push_back(MBB);
BBToOrder[MBB] = RPONumber;
BBNumToRPO[MBB->getNumber()] = RPONumber;
++RPONumber;
@@ -3724,14 +3733,13 @@ bool InstrRefBasedLDV::ExtendRanges(MachineFunction &MF,
// Walk back through each block / instruction, collecting DBG_VALUE
// instructions and recording what machine value their operands refer to.
- for (auto &OrderPair : OrderToBB) {
- MachineBasicBlock &MBB = *OrderPair.second;
- CurBB = MBB.getNumber();
+ for (MachineBasicBlock *MBB : OrderToBB) {
+ CurBB = MBB->getNumber();
VTracker = &vlocs[CurBB];
- VTracker->MBB = &MBB;
- MTracker->loadFromArray(MInLocs[MBB], CurBB);
+ VTracker->MBB = MBB;
+ MTracker->loadFromArray(MInLocs[*MBB], CurBB);
CurInst = 1;
- for (auto &MI : MBB) {
+ for (auto &MI : *MBB) {
process(MI, &MOutLocs, &MInLocs);
++CurInst;
}
diff --git a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
index 6d77a6972f09b..8770983481c2f 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
+++ b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
@@ -1153,7 +1153,7 @@ class InstrRefBasedLDV : public LDVImpl {
SmallPtrSet<MachineBasicBlock *, 16> ArtificialBlocks;
// Mapping of blocks to and from their RPOT order.
- DenseMap<unsigned int, MachineBasicBlock *> OrderToBB;
+ SmallVector<MachineBasicBlock *> OrderToBB;
DenseMap<const MachineBasicBlock *, unsigned int> BBToOrder;
DenseMap<unsigned, unsigned> BBNumToRPO;
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary: This patch adjusts how some data is stored to avoid a number of un-necessary DenseMap queries. There's no change to the compiler behaviour, and it's measurably faster on the compile time tracker. The BlockOrders vector in buildVLocValueMap collects the blocks over which a variables value have to be determined: however the Cmp ordering function makes two DenseMap queries to determine the RPO-order of blocks being compared. And given that sorting is O(N log(N)) comparisons this isn't fast. So instead, fetch the RPO-numbers of the block collection, order those, and then map back to the blocks themselves. The OrderToBB collection mapped an RPO-number to an MBB: it's completely un-necessary to have DenseMap here, we can just use the RPO number as an array index. Switch it to a SmallVector and deal with a few consequences when iterating. (And for good measure I've jammed in a couple of reserve calls). Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251013
This patch adjusts how some data is stored to avoid a number of un-necessary DenseMap queries. There's no change to the compiler behaviour, and it's measurably faster on the compile time tracker.
The BlockOrders vector in buildVLocValueMap collects the blocks over which a variables value have to be determined: however the Cmp ordering function makes two DenseMap queries to determine the RPO-order of blocks being compared. And given that sorting is O(N log(N)) comparisons this isn't fast. So instead, fetch the RPO-numbers of the block collection, order those, and then map back to the blocks themselves.
The OrderToBB collection mapped an RPO-number to an MBB: it's completely un-necessary to have DenseMap here, we can just use the RPO number as an array index. Switch it to a SmallVector and deal with a few consequences when iterating.
(And for good measure I've jammed in a couple of reserve calls).