Skip to content

Commit 25eb7b0

Browse files
committed
[DAGCombiner] Fold BRCOND(FREEZE(COND)) to BRCOND(COND)
This patch resolves the suboptimal codegen described in http://llvm.org/pr47873 . When CodeGenPrepare lowers select into a conditional branch, a freeze instruction is inserted. It is then translated to `BRCOND(FREEZE(SETCC))` in SelDag. The `FREEZE` in the middle of `SETCC` and `BRCOND` was causing a suboptimal code generation however. This patch adds `BRCOND(FREEZE(cond))` -> `BRCOND(cond)` fold to DAGCombiner to remove the `FREEZE`. To make this optimization sound, `BRCOND(UNDEF)` simply should nondeterministically jump to the branch or not, rather than raising UB. It wasn't clear what happens when the condition was undef according to the comments in ISDOpcodes.h, however. I updated the comments of `BRCOND` to make it explicit (as well as `BR_CC`, which is also a conditional branch instruction). Note that it diverges from the semantics of `br` instruction in IR, which is explicitly UB. Since the UB semantics was necessary to explain optimizations that use branching conditions, and SelDag doesn't seem to have such optimization, I think this divergence is okay. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D92015
1 parent 76643c4 commit 25eb7b0

File tree

3 files changed

+16
-6
lines changed

3 files changed

+16
-6
lines changed

llvm/include/llvm/CodeGen/ISDOpcodes.h

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -890,13 +890,18 @@ enum NodeType {
890890
/// BRCOND - Conditional branch. The first operand is the chain, the
891891
/// second is the condition, the third is the block to branch to if the
892892
/// condition is true. If the type of the condition is not i1, then the
893-
/// high bits must conform to getBooleanContents.
893+
/// high bits must conform to getBooleanContents. If the condition is undef,
894+
/// it nondeterministically jumps to the block.
895+
/// TODO: Its semantics w.r.t undef requires further discussion; we need to
896+
/// make it sure that it is consistent with optimizations in MIR & the
897+
/// meaning of IMPLICIT_DEF. See https://reviews.llvm.org/D92015
894898
BRCOND,
895899

896900
/// BR_CC - Conditional branch. The behavior is like that of SELECT_CC, in
897901
/// that the condition is represented as condition code, and two nodes to
898902
/// compare, rather than as a combined SetCC node. The operands in order
899-
/// are chain, cc, lhs, rhs, block to branch to if condition is true.
903+
/// are chain, cc, lhs, rhs, block to branch to if condition is true. If
904+
/// condition is undef, it nondeterministically jumps to the block.
900905
BR_CC,
901906

902907
/// INLINEASM - Represents an inline asm block. This node always has two

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14548,6 +14548,13 @@ SDValue DAGCombiner::visitBRCOND(SDNode *N) {
1454814548
SDValue N1 = N->getOperand(1);
1454914549
SDValue N2 = N->getOperand(2);
1455014550

14551+
// BRCOND(FREEZE(cond)) is equivalent to BRCOND(cond) (both are
14552+
// nondeterministic jumps).
14553+
if (N1->getOpcode() == ISD::FREEZE && N1.hasOneUse()) {
14554+
return DAG.getNode(ISD::BRCOND, SDLoc(N), MVT::Other, Chain,
14555+
N1->getOperand(0), N2);
14556+
}
14557+
1455114558
// If N is a constant we could fold this into a fallthrough or unconditional
1455214559
// branch. However that doesn't happen very often in normal code, because
1455314560
// Instcombine/SimplifyCFG should have handled the available opportunities.

llvm/test/CodeGen/X86/select-prof-codegen.ll

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,13 @@
11
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
22
; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s
33

4-
; TODO: Compiling the select should not create 'seta - testb $1 - jump' sequence.
4+
; Compiling the select should not create 'seta - testb $1 - jump' sequence.
55
define i32 @f(i32 %x, i32 %y) {
66
; CHECK-LABEL: f:
77
; CHECK: # %bb.0: # %entry
88
; CHECK-NEXT: movl %edi, %eax
99
; CHECK-NEXT: cmpl %esi, %edi
10-
; CHECK-NEXT: seta %cl
11-
; CHECK-NEXT: testb $1, %cl
12-
; CHECK-NEXT: jne .LBB0_2
10+
; CHECK-NEXT: ja .LBB0_2
1311
; CHECK-NEXT: # %bb.1: # %select.false
1412
; CHECK-NEXT: movl %esi, %eax
1513
; CHECK-NEXT: .LBB0_2: # %select.end

0 commit comments

Comments
 (0)