Commit b3d0c79

[DAGCombiner] avoid narrowing fake fneg vector op
This may inhibit vector narrowing in general, but there's already an inconsistency in the way that we deal with this pattern as shown by the test diff. We may want to add a dedicated function for narrowing fneg. It's often folded into some other op, so moving it away from other math ops may cause regressions that we would not see for normal binops. See D73978 for more details.
1 parent 299c3e1 commit b3d0c79

2 files changed: +11 -3 lines changed

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Lines changed: 9 additions & 0 deletions
@@ -18556,6 +18556,15 @@ static SDValue narrowExtractedVectorBinOp(SDNode *Extract, SelectionDAG &DAG) {
   if (!TLI.isBinOp(BOpcode) || BinOp.getNode()->getNumValues() != 1)
     return SDValue();
 
+  // Exclude the fake form of fneg (fsub -0.0, x) because that is likely to be
+  // reduced to the unary fneg when it is visited, and we probably want to deal
+  // with fneg in a target-specific way.
+  if (BOpcode == ISD::FSUB) {
+    auto *C = isConstOrConstSplatFP(BinOp.getOperand(0), /*AllowUndefs*/ true);
+    if (C && C->getValueAPF().isNegZero())
+      return SDValue();
+  }
+
   // The binop must be a vector type, so we can extract some fraction of it.
   EVT WideBVT = BinOp.getValueType();
   if (!WideBVT.isVector())
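
For readers unfamiliar with the "fake fneg" pattern, the following is a minimal standalone sketch (plain C++, not LLVM code) of why the new check looks for a negative-zero left operand specifically. Under default IEEE-754 semantics (no fast-math), `-0.0 - x` matches negation for the zero inputs shown here, whereas `+0.0 - x` does not, because it yields `+0.0` when `x` is `+0.0`. The helper names `fakeFNeg`, `notFNeg`, and `sameValueAndSign` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration; none of these exist in LLVM.

// The "fake" binary form of negation: (fsub -0.0, x).
inline float fakeFNeg(float x) { return -0.0f - x; }

// Looks similar, but is not a negation: (fsub +0.0, x) gives +0.0 for x == +0.0.
inline float notFNeg(float x) { return 0.0f - x; }

// Compare both the numeric value and the sign bit, so that +0.0 and -0.0
// are treated as different results.
inline bool sameValueAndSign(float a, float b) {
  return a == b && std::signbit(a) == std::signbit(b);
}
```

With these definitions, `fakeFNeg(0.0f)` has its sign bit set, just like `-(0.0f)`, while `notFNeg(0.0f)` does not. That difference is why only the `-0.0` constant marks an `fsub` as a disguised `fneg`.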

llvm/test/CodeGen/AArch64/arm64-fp.ll

Lines changed: 2 additions & 3 deletions
@@ -62,9 +62,8 @@ define <2 x float> @fake_fneg_splat_extract(<4 x float> %rhs) {
 define <2 x float> @fake_fneg_splat_extract_undef(<4 x float> %rhs) {
 ; CHECK-LABEL: fake_fneg_splat_extract_undef:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: ext v0.16b, v0.16b, v0.16b, #8
-; CHECK-NEXT: fneg v0.2s, v0.2s
-; CHECK-NEXT: dup v0.2s, v0.s[1]
+; CHECK-NEXT: fneg v0.4s, v0.4s
+; CHECK-NEXT: dup v0.2s, v0.s[3]
 ; CHECK-NEXT: ret
   %rhs_neg = fsub <4 x float> <float undef, float -0.0, float -0.0, float -0.0>, %rhs
   %splat = shufflevector <4 x float> %rhs_neg, <4 x float> undef, <2 x i32> <i32 3, i32 3>
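
The constant operand in this test has an `undef` first lane, so the new check only fires here because it passes `AllowUndefs` as true. The sketch below is a simplified model of that behavior as suggested by the diff, not LLVM's implementation of `isConstOrConstSplatFP`: `Lane`, `isNegZeroSplat`, and the use of `std::nullopt` to stand in for `undef` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <optional>
#include <vector>

// Hypothetical model: std::nullopt stands in for an undef vector lane.
using Lane = std::optional<float>;

// Returns true if every defined lane is -0.0 and at least one lane is defined.
// Undef lanes are skipped when allowUndefs is true and rejected otherwise.
// This is a simplified stand-in, not LLVM's isConstOrConstSplatFP.
inline bool isNegZeroSplat(const std::vector<Lane> &lanes, bool allowUndefs) {
  bool sawDefined = false;
  for (const Lane &lane : lanes) {
    if (!lane) {
      if (!allowUndefs)
        return false;
      continue;
    }
    // -0.0 compares equal to 0.0, so the sign bit must be checked as well.
    if (!(*lane == 0.0f && std::signbit(*lane)))
      return false;
    sawDefined = true;
  }
  return sawDefined;
}
```

In this model, the test's operand `<undef, -0.0, -0.0, -0.0>` is accepted only when `allowUndefs` is true, which is consistent with the updated CHECK lines no longer showing a narrowed `fneg v0.2s`.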
