Skip to content

Commit 8d2885c

Browse files
committed
[GuardWidening] Improve analysis of potential widening into hotter block
There is a piece of logic in GuardWidening which is very limited, and it happens to ignore implicit control flow, therefore it "works fine" with guards expressed as intrinsic calls. However, when the guards are represented as branches, its limitations create a lot of trouble. The intent here is to make sure that we do not widen code across complex CFG, so that it can end up being in hotter code than it used to be. The old logic was limited to unique immediate successor and that's it. This patch changes the logic there to work the following way: when we need to check if we can widen from `DominatedBlock` into `DominatingBlock`, we first try to find the lowest (by CFG) transitive successor of `DominatingBlock` which is guaranteed to not be significantly colder than the `DominatingBlock`. It means that every time we move to either: - Unique successor of the current block, if it only has one successor; - The only taken successor, if the current block ends with `br(const)`; - If one of successors ends with deopt, another one is taken; If the lowest block we can find this way is the `DominatedBlock`, then it is safe to assume that this widening won't move the code into a hotter location. I did not touch the existing check with PDT. It looks fishy to me (post-dominance doesn't really guarantee anything about hotness), but let's keep it as is and maybe fix later. With this patch, Guard Widening can widen explicitly expressed branches across more than one dominating guard if it's profitable. Differential Revision: https://reviews.llvm.org/D146276 Reviewed By: skatkov
1 parent aa48538 commit 8d2885c

File tree

2 files changed

+56
-36
lines changed

2 files changed

+56
-36
lines changed

llvm/lib/Transforms/Scalar/GuardWidening.cpp

Lines changed: 40 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -460,27 +460,55 @@ GuardWideningImpl::computeWideningScore(Instruction *DominatedInstr,
460460
if (HoistingOutOfLoop)
461461
return WS_Positive;
462462

463-
// Returns true if we might be hoisting above explicit control flow. Note
464-
// that this completely ignores implicit control flow (guards, calls which
465-
// throw, etc...). That choice appears arbitrary.
466-
auto MaybeHoistingOutOfIf = [&]() {
467-
auto *DominatingBlock = DominatingGuard->getParent();
468-
auto *DominatedBlock = DominatedInstr->getParent();
469-
if (isGuardAsWidenableBranch(DominatingGuard))
470-
DominatingBlock = cast<BranchInst>(DominatingGuard)->getSuccessor(0);
463+
// For a given basic block \p BB, return its successor which is guaranteed or
464+
// highly likely will be taken as its successor.
465+
auto GetLikelySuccessor = [](const BasicBlock * BB)->const BasicBlock * {
466+
if (auto *UniqueSucc = BB->getUniqueSuccessor())
467+
return UniqueSucc;
468+
auto *Term = BB->getTerminator();
469+
Value *Cond = nullptr;
470+
const BasicBlock *IfTrue = nullptr, *IfFalse = nullptr;
471+
using namespace PatternMatch;
472+
if (!match(Term, m_Br(m_Value(Cond), m_BasicBlock(IfTrue),
473+
m_BasicBlock(IfFalse))))
474+
return nullptr;
475+
// For constant conditions, only one dynamical successor is possible
476+
if (auto *ConstCond = dyn_cast<ConstantInt>(Cond))
477+
return ConstCond->isAllOnesValue() ? IfTrue : IfFalse;
478+
// If one of successors ends with deopt, another one is likely.
479+
if (IfFalse->getPostdominatingDeoptimizeCall())
480+
return IfTrue;
481+
if (IfTrue->getPostdominatingDeoptimizeCall())
482+
return IfFalse;
483+
// TODO: Use branch frequency metatada to allow hoisting through non-deopt
484+
// branches?
485+
return nullptr;
486+
};
487+
488+
// Returns true if we might be hoisting above explicit control flow into a
489+
// considerably hotter block. Note that this completely ignores implicit
490+
// control flow (guards, calls which throw, etc...). That choice appears
491+
// arbitrary (we assume that implicit control flow exits are all rare).
492+
auto MaybeHoistingToHotterBlock = [&]() {
493+
const auto *DominatingBlock = DominatingGuard->getParent();
494+
const auto *DominatedBlock = DominatedInstr->getParent();
495+
496+
// Descent as low as we can, always taking the likely successor.
497+
while (DominatedBlock != DominatingBlock)
498+
if (auto *LikelySucc = GetLikelySuccessor(DominatingBlock))
499+
DominatingBlock = LikelySucc;
500+
else
501+
break;
471502

472503
// Same Block?
473504
if (DominatedBlock == DominatingBlock)
474505
return false;
475-
// Obvious successor (common loop header/preheader case)
476-
if (DominatedBlock == DominatingBlock->getUniqueSuccessor())
477-
return false;
478506
// TODO: diamond, triangle cases
479507
if (!PDT) return true;
480508
return !PDT->dominates(DominatedBlock, DominatingBlock);
481509
};
482510

483-
return MaybeHoistingOutOfIf() ? WS_IllegalOrNegative : WS_Neutral;
511+
return MaybeHoistingToHotterBlock() ? WS_IllegalOrNegative : WS_Neutral;
484512
}
485513

486514
bool GuardWideningImpl::canBeHoistedTo(

llvm/test/Transforms/GuardWidening/two_forms_behavior_consistency.ll

Lines changed: 16 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -42,30 +42,26 @@ define void @test_01(i32 %a, i32 %b, i32 %c, i32 %d) {
4242
; BRANCH_FORM-NEXT: entry:
4343
; BRANCH_FORM-NEXT: br label [[LOOP:%.*]]
4444
; BRANCH_FORM: loop:
45-
; BRANCH_FORM-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[GUARDED5:%.*]] ]
45+
; BRANCH_FORM-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[GUARDED:%.*]] ]
4646
; BRANCH_FORM-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
4747
; BRANCH_FORM-NEXT: [[C1:%.*]] = icmp ult i32 [[IV]], [[A]]
4848
; BRANCH_FORM-NEXT: [[C2:%.*]] = icmp ult i32 [[IV]], [[B]]
4949
; BRANCH_FORM-NEXT: [[WIDE_CHK:%.*]] = and i1 [[C1]], [[C2]]
50+
; BRANCH_FORM-NEXT: [[C3:%.*]] = icmp ult i32 [[IV]], [[C]]
51+
; BRANCH_FORM-NEXT: [[WIDE_CHK13:%.*]] = and i1 [[WIDE_CHK]], [[C3]]
52+
; BRANCH_FORM-NEXT: [[C4:%.*]] = icmp ult i32 [[IV]], [[D]]
53+
; BRANCH_FORM-NEXT: [[WIDE_CHK14:%.*]] = and i1 [[WIDE_CHK13]], [[C4]]
5054
; BRANCH_FORM-NEXT: [[WIDENABLE_COND:%.*]] = call i1 @llvm.experimental.widenable.condition()
51-
; BRANCH_FORM-NEXT: [[EXIPLICIT_GUARD_COND:%.*]] = and i1 [[WIDE_CHK]], [[WIDENABLE_COND]]
52-
; BRANCH_FORM-NEXT: br i1 [[EXIPLICIT_GUARD_COND]], label [[GUARDED:%.*]], label [[DEOPT:%.*]], !prof [[PROF0:![0-9]+]]
55+
; BRANCH_FORM-NEXT: [[EXIPLICIT_GUARD_COND:%.*]] = and i1 [[WIDE_CHK14]], [[WIDENABLE_COND]]
56+
; BRANCH_FORM-NEXT: br i1 [[EXIPLICIT_GUARD_COND]], label [[GUARDED]], label [[DEOPT:%.*]], !prof [[PROF0:![0-9]+]]
5357
; BRANCH_FORM: deopt:
5458
; BRANCH_FORM-NEXT: call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
5559
; BRANCH_FORM-NEXT: ret void
5660
; BRANCH_FORM: guarded:
5761
; BRANCH_FORM-NEXT: [[WIDENABLE_COND3:%.*]] = call i1 @llvm.experimental.widenable.condition()
5862
; BRANCH_FORM-NEXT: [[EXIPLICIT_GUARD_COND4:%.*]] = and i1 [[C2]], [[WIDENABLE_COND3]]
59-
; BRANCH_FORM-NEXT: [[C3:%.*]] = icmp ult i32 [[IV]], [[C]]
60-
; BRANCH_FORM-NEXT: [[C4:%.*]] = icmp ult i32 [[IV]], [[D]]
61-
; BRANCH_FORM-NEXT: [[WIDE_CHK13:%.*]] = and i1 [[C3]], [[C4]]
6263
; BRANCH_FORM-NEXT: [[WIDENABLE_COND7:%.*]] = call i1 @llvm.experimental.widenable.condition()
63-
; BRANCH_FORM-NEXT: [[EXIPLICIT_GUARD_COND8:%.*]] = and i1 [[WIDE_CHK13]], [[WIDENABLE_COND7]]
64-
; BRANCH_FORM-NEXT: br i1 [[EXIPLICIT_GUARD_COND8]], label [[GUARDED5]], label [[DEOPT6:%.*]], !prof [[PROF0]]
65-
; BRANCH_FORM: deopt6:
66-
; BRANCH_FORM-NEXT: call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
67-
; BRANCH_FORM-NEXT: ret void
68-
; BRANCH_FORM: guarded5:
64+
; BRANCH_FORM-NEXT: [[EXIPLICIT_GUARD_COND8:%.*]] = and i1 [[C3]], [[WIDENABLE_COND7]]
6965
; BRANCH_FORM-NEXT: [[WIDENABLE_COND11:%.*]] = call i1 @llvm.experimental.widenable.condition()
7066
; BRANCH_FORM-NEXT: [[EXIPLICIT_GUARD_COND12:%.*]] = and i1 [[C4]], [[WIDENABLE_COND11]]
7167
; BRANCH_FORM-NEXT: [[LOOP_COND:%.*]] = call i1 @cond()
@@ -78,30 +74,26 @@ define void @test_01(i32 %a, i32 %b, i32 %c, i32 %d) {
7874
; BRANCH_FORM_LICM-NEXT: entry:
7975
; BRANCH_FORM_LICM-NEXT: br label [[LOOP:%.*]]
8076
; BRANCH_FORM_LICM: loop:
81-
; BRANCH_FORM_LICM-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[GUARDED5:%.*]] ]
77+
; BRANCH_FORM_LICM-NEXT: [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[GUARDED:%.*]] ]
8278
; BRANCH_FORM_LICM-NEXT: [[IV_NEXT]] = add i32 [[IV]], 1
8379
; BRANCH_FORM_LICM-NEXT: [[C1:%.*]] = icmp ult i32 [[IV]], [[A]]
8480
; BRANCH_FORM_LICM-NEXT: [[C2:%.*]] = icmp ult i32 [[IV]], [[B]]
8581
; BRANCH_FORM_LICM-NEXT: [[WIDE_CHK:%.*]] = and i1 [[C1]], [[C2]]
82+
; BRANCH_FORM_LICM-NEXT: [[C3:%.*]] = icmp ult i32 [[IV]], [[C]]
83+
; BRANCH_FORM_LICM-NEXT: [[WIDE_CHK13:%.*]] = and i1 [[WIDE_CHK]], [[C3]]
84+
; BRANCH_FORM_LICM-NEXT: [[C4:%.*]] = icmp ult i32 [[IV]], [[D]]
85+
; BRANCH_FORM_LICM-NEXT: [[WIDE_CHK14:%.*]] = and i1 [[WIDE_CHK13]], [[C4]]
8686
; BRANCH_FORM_LICM-NEXT: [[WIDENABLE_COND:%.*]] = call i1 @llvm.experimental.widenable.condition()
87-
; BRANCH_FORM_LICM-NEXT: [[EXIPLICIT_GUARD_COND:%.*]] = and i1 [[WIDE_CHK]], [[WIDENABLE_COND]]
88-
; BRANCH_FORM_LICM-NEXT: br i1 [[EXIPLICIT_GUARD_COND]], label [[GUARDED:%.*]], label [[DEOPT:%.*]], !prof [[PROF0:![0-9]+]]
87+
; BRANCH_FORM_LICM-NEXT: [[EXIPLICIT_GUARD_COND:%.*]] = and i1 [[WIDE_CHK14]], [[WIDENABLE_COND]]
88+
; BRANCH_FORM_LICM-NEXT: br i1 [[EXIPLICIT_GUARD_COND]], label [[GUARDED]], label [[DEOPT:%.*]], !prof [[PROF0:![0-9]+]]
8989
; BRANCH_FORM_LICM: deopt:
9090
; BRANCH_FORM_LICM-NEXT: call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
9191
; BRANCH_FORM_LICM-NEXT: ret void
9292
; BRANCH_FORM_LICM: guarded:
9393
; BRANCH_FORM_LICM-NEXT: [[WIDENABLE_COND3:%.*]] = call i1 @llvm.experimental.widenable.condition()
9494
; BRANCH_FORM_LICM-NEXT: [[EXIPLICIT_GUARD_COND4:%.*]] = and i1 [[C2]], [[WIDENABLE_COND3]]
95-
; BRANCH_FORM_LICM-NEXT: [[C3:%.*]] = icmp ult i32 [[IV]], [[C]]
96-
; BRANCH_FORM_LICM-NEXT: [[C4:%.*]] = icmp ult i32 [[IV]], [[D]]
97-
; BRANCH_FORM_LICM-NEXT: [[WIDE_CHK13:%.*]] = and i1 [[C3]], [[C4]]
9895
; BRANCH_FORM_LICM-NEXT: [[WIDENABLE_COND7:%.*]] = call i1 @llvm.experimental.widenable.condition()
99-
; BRANCH_FORM_LICM-NEXT: [[EXIPLICIT_GUARD_COND8:%.*]] = and i1 [[WIDE_CHK13]], [[WIDENABLE_COND7]]
100-
; BRANCH_FORM_LICM-NEXT: br i1 [[EXIPLICIT_GUARD_COND8]], label [[GUARDED5]], label [[DEOPT6:%.*]], !prof [[PROF0]]
101-
; BRANCH_FORM_LICM: deopt6:
102-
; BRANCH_FORM_LICM-NEXT: call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
103-
; BRANCH_FORM_LICM-NEXT: ret void
104-
; BRANCH_FORM_LICM: guarded5:
96+
; BRANCH_FORM_LICM-NEXT: [[EXIPLICIT_GUARD_COND8:%.*]] = and i1 [[C3]], [[WIDENABLE_COND7]]
10597
; BRANCH_FORM_LICM-NEXT: [[WIDENABLE_COND11:%.*]] = call i1 @llvm.experimental.widenable.condition()
10698
; BRANCH_FORM_LICM-NEXT: [[EXIPLICIT_GUARD_COND12:%.*]] = and i1 [[C4]], [[WIDENABLE_COND11]]
10799
; BRANCH_FORM_LICM-NEXT: [[LOOP_COND:%.*]] = call i1 @cond()

0 commit comments

Comments
 (0)