-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AMDGPU] Improve StructurizeCFG pass performance: avoid redundant DebugLoc map initialization. NFC. #130568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ry DebugLoc map initialization. NFC.
@llvm/pr-subscribers-llvm-transforms Author: Valery Pykhtin (vpykhtin) ChangesPreviously, the TermDL (BB terminator → DebugLoc) map was initialized at the start of processing each function's region, creating entries for the entire function. This could be inefficient for large functions. This patch improves performance by creating map entries only when needed—when a terminator is being killed or when a flow block is created. Additionally, entries are removed immediately after use, preventing unnecessary map growth and ensuring DebugLocs are not "retracked." A mapless variant was also explored, but due to limited familiarity with the structurizer, it was not pursued further. In my cases, this change improves performance by 2-3×. Full diff: https://github.com/llvm/llvm-project/pull/130568.diff 1 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
index 89a2a7ac9be3f..db6321cc2f71e 100644
--- a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
@@ -361,6 +361,8 @@ class StructurizeCFG {
void rebuildSSA();
+ void setDebugLoc(BranchInst *Br, BasicBlock *BB);
+
public:
void init(Region *R);
bool run(Region *R, DominatorTree *DT);
@@ -595,14 +597,6 @@ void StructurizeCFG::collectInfos() {
// Find the last back edges
analyzeLoops(RN);
}
-
- // Reset the collected term debug locations
- TermDL.clear();
-
- for (BasicBlock &BB : *Func) {
- if (const DebugLoc &DL = BB.getTerminator()->getDebugLoc())
- TermDL[&BB] = DL;
- }
}
/// Insert the missing branch conditions
@@ -924,12 +918,24 @@ void StructurizeCFG::simplifyAffectedPhis() {
} while (Changed);
}
+void StructurizeCFG::setDebugLoc(BranchInst *Br, BasicBlock *BB) {
+ auto I = TermDL.find(BB);
+ if (I == TermDL.end())
+ return;
+
+ Br->setDebugLoc(I->second);
+ TermDL.erase(I);
+}
+
/// Remove phi values from all successors and then remove the terminator.
void StructurizeCFG::killTerminator(BasicBlock *BB) {
Instruction *Term = BB->getTerminator();
if (!Term)
return;
+ if (const DebugLoc &DL = Term->getDebugLoc())
+ TermDL[BB] = DL;
+
for (BasicBlock *Succ : successors(BB))
delPhiValues(BB, Succ);
@@ -974,7 +980,7 @@ void StructurizeCFG::changeExit(RegionNode *Node, BasicBlock *NewExit,
BasicBlock *BB = Node->getNodeAs<BasicBlock>();
killTerminator(BB);
BranchInst *Br = BranchInst::Create(NewExit, BB);
- Br->setDebugLoc(TermDL[BB]);
+ setDebugLoc(Br, BB);
addPhiValues(BB, NewExit);
if (IncludeDominator)
DT->changeImmediateDominator(NewExit, BB);
@@ -990,10 +996,10 @@ BasicBlock *StructurizeCFG::getNextFlow(BasicBlock *Dominator) {
Func, Insert);
FlowSet.insert(Flow);
- // use a temporary variable to avoid a use-after-free if the map's storage is
- // reallocated
- DebugLoc DL = TermDL[Dominator];
- TermDL[Flow] = std::move(DL);
+ auto *Term = Dominator->getTerminator();
+ if (const DebugLoc &DL =
+ Term ? Term->getDebugLoc() : TermDL.lookup(Dominator))
+ TermDL[Flow] = DL;
DT->addNewBlock(Flow, Dominator);
ParentRegion->getRegionInfo()->setRegionFor(Flow, ParentRegion);
@@ -1088,7 +1094,7 @@ void StructurizeCFG::wireFlow(bool ExitUseAllowed,
// let it point to entry and next block
BranchInst *Br = BranchInst::Create(Entry, Next, BoolPoison, Flow);
- Br->setDebugLoc(TermDL[Flow]);
+ setDebugLoc(Br, Flow);
Conditions.push_back(Br);
addPhiValues(Flow, Entry);
DT->changeImmediateDominator(Entry, Flow);
@@ -1129,7 +1135,7 @@ void StructurizeCFG::handleLoops(bool ExitUseAllowed,
LoopEnd = needPrefix(false);
BasicBlock *Next = needPostfix(LoopEnd, ExitUseAllowed);
BranchInst *Br = BranchInst::Create(Next, LoopStart, BoolPoison, LoopEnd);
- Br->setDebugLoc(TermDL[LoopEnd]);
+ setDebugLoc(Br, LoopEnd);
LoopConds.push_back(Br);
addPhiValues(LoopEnd, LoopStart);
setPrevNode(Next);
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/1048 Here is the relevant piece of the build log for the reference
|
The https://lab.llvm.org/buildbot/#/builders/10/builds/1048 failure looks unrelated. |
Previously, the TermDL (BB terminator → DebugLoc) map was initialized at the start of processing each function's region, creating entries for the entire function. This could be inefficient for large functions.
This patch improves performance by creating map entries only when needed—when a terminator is being killed or when a flow block is created. Additionally, entries are removed immediately after use, preventing unnecessary map growth and ensuring DebugLocs are not "retracked."
A mapless variant was also explored, but due to limited familiarity with the structurizer, it was not pursued further.
In my cases, this change improves performance by 2-3×.