Commit 30cabdd

mskamp and RKSimon authored
[X86] Distribute Certain Bitwise Operations over SELECT (#136555)
InstCombine canonicalizes `(select P (and X (- X)) X)` to `(and (select P (- X) umax) X)`. This is counterproductive for the X86 backend when BMI is available, because `(and X (- X))` can be encoded with the `BLSI` instruction. A similar situation arises for `(select P (and X (sub X 1)) X)`, which prevents use of the `BLSR` instruction, and for `(select P (xor X (sub X 1)) X)`, which prevents use of the `BLSMSK` instruction.

Trigger the inverse transformation in the X86 backend if BMI is available and the mentioned BMI instructions can be used. This is done by overriding the appropriate `shouldFoldSelectWithIdentityConstant()` overload. In this way, we get `(select P (and X (- X)) X)` again, which enables the use of `BLSI` (and similarly for the other cases described above).

Alive proofs: https://alive2.llvm.org/ce/z/MT_pKi

Fixes #131587, fixes #133848.

---------

Co-authored-by: Simon Pilgrim <[email protected]>
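To make the equivalence concrete, here is a small standalone C++ sketch (not part of the commit) of the three bit tricks and of the two select shapes. All function names are made up for illustration. `~0u` plays the role of `umax`, and the XOR variant assumes the identity constant 0, which the commit message does not spell out.

```cpp
#include <cstdint>

// Scalar equivalents of the BMI instructions named in the commit message.
constexpr std::uint32_t blsi(std::uint32_t x) { return x & (0u - x); }    // isolate lowest set bit
constexpr std::uint32_t blsr(std::uint32_t x) { return x & (x - 1u); }    // clear lowest set bit
constexpr std::uint32_t blsmsk(std::uint32_t x) { return x ^ (x - 1u); }  // mask through lowest set bit

// Shape the X86 backend wants: (select P (and X (- X)) X).
constexpr std::uint32_t select_of_blsi(bool p, std::uint32_t x) {
  return p ? blsi(x) : x;
}

// Shape InstCombine produces: (and (select P (- X) umax) X).
// When P is false the AND sees its identity (all ones) and yields X.
constexpr std::uint32_t and_of_select(bool p, std::uint32_t x) {
  return (p ? (0u - x) : ~0u) & x;
}

// Same pair for BLSMSK, assuming XOR's identity constant 0.
constexpr std::uint32_t select_of_blsmsk(bool p, std::uint32_t x) {
  return p ? blsmsk(x) : x;
}
constexpr std::uint32_t xor_of_select(bool p, std::uint32_t x) {
  return (p ? (x - 1u) : 0u) ^ x;
}

// Compile-time spot checks on 12 = 0b1100.
static_assert(blsi(12u) == 4u, "lowest set bit of 0b1100 is 0b0100");
static_assert(blsr(12u) == 8u, "clearing it leaves 0b1000");
static_assert(blsmsk(12u) == 7u, "mask through it is 0b0111");
static_assert(select_of_blsi(true, 12u) == and_of_select(true, 12u), "");
static_assert(select_of_blsi(false, 12u) == and_of_select(false, 12u), "");
static_assert(select_of_blsmsk(true, 12u) == xor_of_select(true, 12u), "");
static_assert(select_of_blsmsk(false, 12u) == xor_of_select(false, 12u), "");
```

Both shapes compute the same value. The difference is only in which one exposes the `and`/`xor` of `X` with its derived operand as a single node that instruction selection can turn into one BMI instruction.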
1 parent 94c2416 commit 30cabdd

2 files changed: +892 -1 lines changed

llvm/lib/Target/X86/X86ISelLowering.cpp

Lines changed: 22 additions & 1 deletion
@@ -35618,8 +35618,29 @@ bool X86TargetLowering::isNarrowingProfitable(SDNode *N, EVT SrcVT,
 bool X86TargetLowering::shouldFoldSelectWithIdentityConstant(
     unsigned BinOpcode, EVT VT, unsigned SelectOpcode, SDValue X,
     SDValue Y) const {
-  if (SelectOpcode != ISD::VSELECT)
+  if (SelectOpcode == ISD::SELECT) {
+    if (VT.isVector())
+      return false;
+    if (!Subtarget.hasBMI() || (VT != MVT::i32 && VT != MVT::i64))
+      return false;
+    using namespace llvm::SDPatternMatch;
+    // BLSI
+    if (BinOpcode == ISD::AND && (sd_match(Y, m_Neg(m_Specific(X))) ||
+                                  sd_match(X, m_Neg(m_Specific(Y)))))
+      return true;
+    // BLSR
+    if (BinOpcode == ISD::AND &&
+        (sd_match(Y, m_Add(m_Specific(X), m_AllOnes())) ||
+         sd_match(X, m_Add(m_Specific(Y), m_AllOnes()))))
+      return true;
+    // BLSMSK
+    if (BinOpcode == ISD::XOR &&
+        (sd_match(Y, m_Add(m_Specific(X), m_AllOnes())) ||
+         sd_match(X, m_Add(m_Specific(Y), m_AllOnes()))))
+      return true;
+
     return false;
+  }
   // TODO: This is too general. There are cases where pre-AVX512 codegen would
   // benefit. The transform may also be profitable for scalar code.
   if (!Subtarget.hasAVX512())
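
The commit message writes the BLSR and BLSMSK operand as `(sub X 1)`, while the matcher in this hunk looks for `m_Add(m_Specific(X), m_AllOnes())`. The two denote the same value in two's-complement arithmetic, since adding the all-ones constant is subtracting 1 modulo 2^N; presumably the add form is what the DAG contains by the time this hook runs. A minimal standalone check, with hypothetical helper names that are not LLVM API:

```cpp
#include <cstdint>

// All-ones constant for a 64-bit value (i64 is one of the two types the hook accepts).
constexpr std::uint64_t kAllOnes = ~std::uint64_t{0};

// (add X, all-ones) and (sub X, 1) wrap to the same result.
constexpr std::uint64_t add_all_ones(std::uint64_t x) { return x + kAllOnes; }
constexpr std::uint64_t sub_one(std::uint64_t x) { return x - 1u; }

// The BLSR and BLSMSK shapes expressed with the add form the matcher checks.
constexpr std::uint64_t blsr_via_add(std::uint64_t x) { return x & add_all_ones(x); }
constexpr std::uint64_t blsmsk_via_add(std::uint64_t x) { return x ^ add_all_ones(x); }

// Compile-time spot checks on 40 = 0b101000.
static_assert(add_all_ones(40u) == sub_one(40u), "add of all-ones equals sub of 1");
static_assert(add_all_ones(0u) == sub_one(0u), "also holds when the result wraps");
static_assert(blsr_via_add(40u) == 32u, "0b101000 & 0b100111 == 0b100000");
static_assert(blsmsk_via_add(40u) == 15u, "0b101000 ^ 0b100111 == 0b001111");
```

Because the hook tries both `sd_match(Y, ...)` and `sd_match(X, ...)`, either operand of the binary operation may be the derived value.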
