Commit 66911b7

[X86] Fold (add X, (srl Y, 7)) -> (sub X, (icmp_sgt 0, Y)) on vXi8 vectors (#143359)
Undo the vectorcombine canonicalisation, as SSE has awful vXi8 shift support but can easily splat the MSB using the PCMPGTB(0,x) trick.

Alternative to #143106, which could cause infinite loops between srl/sra conversions.

Fixes #130549
1 parent 891a2c3 commit 66911b7

4 files changed: +557 -582 lines changed

llvm/lib/Target/X86/X86ISelLowering.cpp

Lines changed: 15 additions & 5 deletions
@@ -58059,21 +58059,31 @@ static SDValue combineAdd(SDNode *N, SelectionDAG &DAG,
     }
   }
 
-  // If vectors of i1 are legal, turn (add (zext (vXi1 X)), Y) into
-  // (sub Y, (sext (vXi1 X))).
-  // FIXME: We have the (sub Y, (zext (vXi1 X))) -> (add (sext (vXi1 X)), Y) in
-  // generic DAG combine without a legal type check, but adding this there
-  // caused regressions.
   if (VT.isVector()) {
     SDValue X, Y;
     EVT BoolVT = EVT::getVectorVT(*DAG.getContext(), MVT::i1,
                                   VT.getVectorElementCount());
+
+    // If vectors of i1 are legal, turn (add (zext (vXi1 X)), Y) into
+    // (sub Y, (sext (vXi1 X))).
+    // FIXME: We have the (sub Y, (zext (vXi1 X))) -> (add (sext (vXi1 X)), Y)
+    // in generic DAG combine without a legal type check, but adding this there
+    // caused regressions.
     if (DAG.getTargetLoweringInfo().isTypeLegal(BoolVT) &&
         sd_match(N, m_Add(m_ZExt(m_AllOf(m_SpecificVT(BoolVT), m_Value(X))),
                           m_Value(Y)))) {
       SDValue SExt = DAG.getNode(ISD::SIGN_EXTEND, DL, VT, X);
       return DAG.getNode(ISD::SUB, DL, VT, Y, SExt);
     }
+
+    // Fold (add X, (srl Y, 7)) -> (sub X, (icmp_sgt 0, Y)) to undo instcombine
+    // canonicalisation as we don't have good vXi8 shifts.
+    if (VT.getScalarType() == MVT::i8 &&
+        sd_match(N, m_Add(m_Value(X), m_Srl(m_Value(Y), m_SpecificInt(7))))) {
+      SDValue Cmp =
+          DAG.getSetCC(DL, BoolVT, DAG.getConstant(0, DL, VT), Y, ISD::SETGT);
+      return DAG.getNode(ISD::SUB, DL, VT, X, DAG.getSExtOrTrunc(Cmp, DL, VT));
+    }
   }
 
   // Peephole for 512-bit VPDPBSSD on non-VLX targets.
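
The new fold relies on a per-lane arithmetic identity: for an i8 value Y, (srl Y, 7) is 1 exactly when Y is negative, and the sign-extended result of (icmp_sgt 0, Y) is -1 in the same case, so adding the former equals subtracting the latter. The following is a minimal scalar sketch of that identity, not code from this commit. The function names add_srl7, sub_sgt0 and fold_holds_for_all_i8 are hypothetical, and a single uint8_t stands in for one vector lane.

```cpp
#include <cstdint>

// Left-hand side of the fold for one i8 lane: X + (Y >>u 7), wrapping mod 256.
std::uint8_t add_srl7(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>(x + (y >> 7));
}

// Right-hand side: X - sext(0 >s Y). The signed compare produces an all-ones
// lane (0xFF, i.e. -1) when Y is negative and 0 otherwise, which is what a
// PCMPGTB-style byte compare against zero would yield.
// Assumption: the uint8_t -> int8_t cast wraps (two's complement), which is
// guaranteed from C++20 and is the behaviour of mainstream compilers before it.
std::uint8_t sub_sgt0(std::uint8_t x, std::uint8_t y) {
  bool y_is_negative = 0 > static_cast<std::int8_t>(y);
  std::uint8_t mask = y_is_negative ? 0xFF : 0x00;
  return static_cast<std::uint8_t>(x - mask);
}

// Exhaustively compare both forms over all 256 * 256 (X, Y) pairs.
bool fold_holds_for_all_i8() {
  for (int x = 0; x < 256; ++x) {
    for (int y = 0; y < 256; ++y) {
      std::uint8_t xb = static_cast<std::uint8_t>(x);
      std::uint8_t yb = static_cast<std::uint8_t>(y);
      if (add_srl7(xb, yb) != sub_sgt0(xb, yb))
        return false;
    }
  }
  return true;
}
```

Because the input space is only 65536 pairs, an exhaustive check like this is cheap. It covers only the scalar arithmetic, not the DAG combine or the instruction selection that follows it.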