Commit 07d6142
Author: Kyle Butt
Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough.

If AnalyzeBranch can't analyze a block and it is possible to fall through, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough.

Submitted with a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging.

llvm-svn: 278866
1 parent 60ea1b4 commit 07d6142

2 files changed: +44 -0 lines changed

llvm/lib/CodeGen/TailDuplicator.cpp

Lines changed: 10 additions & 0 deletions
@@ -518,6 +518,16 @@ bool TailDuplicator::shouldTailDuplicate(const MachineFunction &MF,
   else
     MaxDuplicateCount = TailDuplicateSize;
 
+  // If the block to be duplicated ends in an unanalyzable fallthrough, don't
+  // duplicate it.
+  // A similar check is necessary in MachineBlockPlacement to make sure pairs of
+  // blocks with unanalyzable fallthrough get layed out contiguously.
+  MachineBasicBlock *PredTBB = nullptr, *PredFBB = nullptr;
+  SmallVector<MachineOperand, 4> PredCond;
+  if (TII->analyzeBranch(TailBB, PredTBB, PredFBB, PredCond, true)
+      && TailBB.canFallThrough())
+    return false;
+
   // If the target has hardware branch prediction that can handle indirect
   // branches, duplicating them can often make them predictable when there
   // are common paths through the code. The limit needs to be high enough
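
The added check can be read as a two-input predicate. The sketch below is a toy model, not LLVM code: `ToyBlock`, its fields, and `mayTailDuplicate` are hypothetical names introduced only for illustration. It assumes, as the patch does, that `analyzeBranch` returning true means the block's terminators could not be analyzed.

```cpp
// Toy model of the new early exit; all names here are hypothetical.
struct ToyBlock {
  // Stands in for the result of TII->analyzeBranch(...):
  // true means the branch could NOT be analyzed.
  bool branchUnanalyzable;
  // Stands in for MachineBasicBlock::canFallThrough().
  bool canFallThrough;
};

// Returns false when the block has to stay a single copy.
// An un-analyzable fallthrough needs its layout successor placed directly
// after it, and only one copy of the block can occupy that position.
constexpr bool mayTailDuplicate(const ToyBlock &tail) {
  return !(tail.branchUnanalyzable && tail.canFallThrough);
}

// Only the combination of both conditions blocks duplication.
static_assert(!mayTailDuplicate(ToyBlock{true, true}),
              "un-analyzable fallthrough: do not duplicate");
static_assert(mayTailDuplicate(ToyBlock{true, false}),
              "un-analyzable but no fallthrough: not rejected here");
static_assert(mayTailDuplicate(ToyBlock{false, true}),
              "analyzable fallthrough: not rejected here");
```

In the real function a block that passes this check is still subject to the checks that follow it, so a true result here only means "not rejected by this test".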
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+; RUN: llc -O2 < %s | FileCheck %s
+target datalayout = "e-m:e-i64:64-n32:64"
+target triple = "powerpc64le-unknown-linux-gnu"
+
+; Check that the conditional return block of fmax_double3.exit was not
+; duplicated into the if.then.i block
+; CHECK: # %if.then.i
+; CHECK: lxvd2x
+; CHECK: stxvd2x
+; CHECK-NOT: bclr
+; CHECK: {{^}}.LBB{{[0-9_]+}}:
+; CHECK-SAME: # %fmax_double3.exit
+; CHECK: bclr
+; CHECK: # %if.then
+; Function Attrs: nounwind
+define void @__fmax_double3_3D_exec(<2 x double>* %input6, i1 %bool1, i1 %bool2) #0 {
+entry:
+  br i1 %bool1, label %if.then.i, label %fmax_double3.exit
+
+if.then.i:                                        ; preds = %entry
+  store <2 x double> zeroinitializer, <2 x double>* %input6, align 32
+  br label %fmax_double3.exit
+
+fmax_double3.exit:                                ; preds = %if.then.i, %entry
+  br i1 %bool2, label %if.then, label %do.end
+
+if.then:                                          ; preds = %fmax_double3.exit
+  unreachable
+
+do.end:                                           ; preds = %fmax_double3.exit
+  ret void
+}
+
+attributes #0 = { nounwind }
