Commit 5db6eac

[X86] Avoid useless DomTree in flags copy lowering (#97628)
Currently, flags copy lowering does two expensive things:

- It traverses the CFG in RPO, and
- It requires a dominator tree that is not preserved.

Most notably, it is the only machine dominator tree user at -O0.

Many functions have no flag copies to begin with, therefore, add an early exit if EFLAGS has no COPY def.

The legacy pass manager has no way to dynamically decide whether an analysis is required. Therefore, if there's a copy, get the dominator tree from the pass manager, if it has one, otherwise, compute it.

These changes should make the pass very cheap for the common case.
1 parent 817f0d9 commit 5db6eac
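The strategy in the commit message (bail out early when there is nothing to lower, reuse an analysis if one is already available, and build it locally only as a last resort) can be illustrated with a minimal standalone sketch. Everything here is a hypothetical stand-in rather than LLVM code: `Instr`, `DomTree`, `DomTreeBuilds`, and `lowerFlagCopies` are invented for illustration, and the "analysis" does no real work.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Illustrative stand-ins only; these are NOT LLVM types.
enum class Opcode { Copy, Add, Cmp };

// Hypothetical instruction: an opcode plus whether it defines the flags register.
struct Instr {
  Opcode Op;
  bool DefinesFlags;
};

// Counts how often the (pretend) expensive analysis is constructed.
int DomTreeBuilds = 0;

// Hypothetical expensive analysis.
struct DomTree {
  DomTree() { ++DomTreeBuilds; }
};

// Returns true if the function would be changed. Cached may be null, which
// models an analysis that the pass manager did not have available.
bool lowerFlagCopies(const std::vector<Instr> &Fn, DomTree *Cached) {
  // Early exit: no COPY defines the flags register, so there is nothing to do.
  if (std::none_of(Fn.begin(), Fn.end(), [](const Instr &I) {
        return I.DefinesFlags && I.Op == Opcode::Copy;
      }))
    return false;

  // Use the cached analysis when present; otherwise build one locally and
  // let the unique_ptr release it when this function returns.
  std::unique_ptr<DomTree> Owned;
  DomTree *DT = Cached;
  if (!DT) {
    Owned = std::make_unique<DomTree>();
    DT = Owned.get();
  }

  (void)DT; // A real lowering would query the tree here.
  return true;
}
```

In this sketch, a function without flag copies never constructs a `DomTree`, and passing a non-null `Cached` pointer avoids constructing a second one.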

3 files changed: +21 -4 lines changed


llvm/lib/Target/X86/X86FlagsCopyLowering.cpp

Lines changed: 21 additions & 2 deletions
@@ -128,7 +128,7 @@ FunctionPass *llvm::createX86FlagsCopyLoweringPass() {
 char X86FlagsCopyLoweringPass::ID = 0;
 
 void X86FlagsCopyLoweringPass::getAnalysisUsage(AnalysisUsage &AU) const {
-  AU.addRequired<MachineDominatorTreeWrapperPass>();
+  AU.addUsedIfAvailable<MachineDominatorTreeWrapperPass>();
   MachineFunctionPass::getAnalysisUsage(AU);
 }
 

@@ -258,13 +258,32 @@ bool X86FlagsCopyLoweringPass::runOnMachineFunction(MachineFunction &MF) {
   MRI = &MF.getRegInfo();
   TII = Subtarget->getInstrInfo();
   TRI = Subtarget->getRegisterInfo();
-  MDT = &getAnalysis<MachineDominatorTreeWrapperPass>().getDomTree();
   PromoteRC = &X86::GR8RegClass;
 
   if (MF.empty())
     // Nothing to do for a degenerate empty function...
     return false;
 
+  if (none_of(MRI->def_instructions(X86::EFLAGS), [](const MachineInstr &MI) {
+        return MI.getOpcode() == TargetOpcode::COPY;
+      }))
+    return false;
+
+  // We change the code, so we don't preserve the dominator tree anyway. If we
+  // got a valid MDT from the pass manager, use that, otherwise construct one
+  // now. This is an optimization that avoids unnecessary MDT construction for
+  // functions that have no flag copies.
+
+  auto MDTWrapper = getAnalysisIfAvailable<MachineDominatorTreeWrapperPass>();
+  std::unique_ptr<MachineDominatorTree> OwnedMDT;
+  if (MDTWrapper) {
+    MDT = &MDTWrapper->getDomTree();
+  } else {
+    OwnedMDT = std::make_unique<MachineDominatorTree>();
+    OwnedMDT->getBase().recalculate(MF);
+    MDT = OwnedMDT.get();
+  }
+
   // Collect the copies in RPO so that when there are chains where a copy is in
   // turn copied again we visit the first one first. This ensures we can find
   // viable locations for testing the original EFLAGS that dominate all the
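
The trailing context comment depends on reverse post-order (RPO): ignoring back edges, a block is visited before its successors, so a flag copy is seen before any later copy that chains from it. The following is a minimal standalone sketch of computing RPO over a small graph. The names `CFG`, `postOrder`, and `reversePostOrder` are hypothetical and are not the pass's code; within LLVM itself, `ReversePostOrderTraversal` from `llvm/ADT/PostOrderIterator.h` serves this purpose.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical CFG: Succs[B] lists the successor block indices of block B.
using CFG = std::vector<std::vector<int>>;

// Depth-first walk that records each block after all of its unvisited
// successors have been recorded (post-order).
static void postOrder(const CFG &Succs, int Block, std::vector<bool> &Seen,
                      std::vector<int> &Out) {
  Seen[Block] = true;
  for (int S : Succs[Block])
    if (!Seen[S])
      postOrder(Succs, S, Seen, Out);
  Out.push_back(Block);
}

// Returns the blocks reachable from Entry in reverse post-order.
std::vector<int> reversePostOrder(const CFG &Succs, int Entry) {
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<int> Order;
  postOrder(Succs, Entry, Seen, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}
```

For a diamond-shaped graph (block 0 branches to 1 and 2, which both jump to 3), this yields the entry block first and the join block last, which is the property the pass comment relies on.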

llvm/test/CodeGen/X86/O0-pipeline.ll

Lines changed: 0 additions & 1 deletion
@@ -44,7 +44,6 @@
 ; CHECK-NEXT: Finalize ISel and expand pseudo-instructions
 ; CHECK-NEXT: Local Stack Slot Allocation
 ; CHECK-NEXT: X86 speculative load hardening
-; CHECK-NEXT: MachineDominator Tree Construction
 ; CHECK-NEXT: X86 EFLAGS copy lowering
 ; CHECK-NEXT: X86 DynAlloca Expander
 ; CHECK-NEXT: Fast Tile Register Preconfigure

llvm/test/CodeGen/X86/opt-pipeline.ll

Lines changed: 0 additions & 1 deletion
@@ -125,7 +125,6 @@
 ; CHECK-NEXT: X86 Optimize Call Frame
 ; CHECK-NEXT: X86 Avoid Store Forwarding Block
 ; CHECK-NEXT: X86 speculative load hardening
-; CHECK-NEXT: MachineDominator Tree Construction
 ; CHECK-NEXT: X86 EFLAGS copy lowering
 ; CHECK-NEXT: X86 DynAlloca Expander
 ; CHECK-NEXT: MachineDominator Tree Construction
