Commit f06d644

[X86] Distribute Certain Bitwise Operations over SELECT
InstCombine canonicalizes `(select P (and X (- X)) X)` to `(and (select P (- X) umax) X)`. This is counterproductive for the X86 backend when BMI is available because we can encode `(and X (- X))` using the `BLSI` instruction. A similar situation arises if we have `(select P (and X (sub X 1)) X)` (prevents use of `BLSR` instruction) or `(select P (xor X (sub X 1)) X)` (prevents use of `BLSMSK` instruction).

Trigger the inverse transformation in the X86 backend if BMI is available and we can use the mentioned BMI instructions. This is done by adjusting the `shouldFoldSelectWithIdentityConstant()` implementation for the X86 backend. In this way, we get `(select P (and X (- X)) X)` again, which enables the use of `BLSI` (similar for the other cases described above).

Alive proofs: https://alive2.llvm.org/ce/z/MT_pKi

Fixes #131587, fixes #133848.
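The three bit tricks and the two equivalent select placements described above can be checked in plain C++. This is only an illustrative sketch: the helper names (`blsi`, `blsr`, `blsmsk`, `selectOutside*`, `selectInside*`, `formsAgree`) are made up for this example and are not part of LLVM. It also assumes that the identity constant for the `xor` case is 0, which the commit message does not spell out (it names only `umax` for `and`).

```cpp
#include <cstdint>

// Scalar equivalents of the three BMI instructions named in the commit message.
constexpr uint32_t blsi(uint32_t x)   { return x & (0u - x); } // isolate lowest set bit
constexpr uint32_t blsr(uint32_t x)   { return x & (x - 1u); } // clear lowest set bit
constexpr uint32_t blsmsk(uint32_t x) { return x ^ (x - 1u); } // mask up to lowest set bit

static_assert(blsi(12u) == 4u, "0b1100 -> 0b0100");
static_assert(blsr(12u) == 8u, "0b1100 -> 0b1000");
static_assert(blsmsk(12u) == 7u, "0b1100 -> 0b0111");

// Form the backend wants: the select sits outside the binary operation.
constexpr uint32_t selectOutsideBlsi(bool p, uint32_t x)   { return p ? blsi(x) : x; }
constexpr uint32_t selectOutsideBlsr(bool p, uint32_t x)   { return p ? blsr(x) : x; }
constexpr uint32_t selectOutsideBlsmsk(bool p, uint32_t x) { return p ? blsmsk(x) : x; }

// Canonical form: the select picks between the operand and the identity
// constant of the operation (all-ones for AND; 0 for XOR is an assumption
// of this sketch).
constexpr uint32_t selectInsideBlsi(bool p, uint32_t x)   { return x & (p ? (0u - x) : ~0u); }
constexpr uint32_t selectInsideBlsr(bool p, uint32_t x)   { return x & (p ? (x - 1u) : ~0u); }
constexpr uint32_t selectInsideBlsmsk(bool p, uint32_t x) { return x ^ (p ? (x - 1u) : 0u); }

// Compare both forms on all 16-bit inputs plus a few 32-bit edge values.
inline bool formsAgree() {
  const uint32_t edges[] = {0x80000000u, 0xFFFFFFFFu, 0x7FFFFFFFu, 0xFFFF0000u};
  for (int pi = 0; pi < 2; ++pi) {
    const bool p = (pi != 0);
    for (uint32_t x = 0; x <= 0xFFFFu; ++x) {
      if (selectOutsideBlsi(p, x) != selectInsideBlsi(p, x)) return false;
      if (selectOutsideBlsr(p, x) != selectInsideBlsr(p, x)) return false;
      if (selectOutsideBlsmsk(p, x) != selectInsideBlsmsk(p, x)) return false;
    }
    for (uint32_t x : edges) {
      if (selectOutsideBlsi(p, x) != selectInsideBlsi(p, x)) return false;
      if (selectOutsideBlsr(p, x) != selectInsideBlsr(p, x)) return false;
      if (selectOutsideBlsmsk(p, x) != selectInsideBlsmsk(p, x)) return false;
    }
  }
  return true;
}
```

Both forms compute the same value; the difference is only in which one exposes the `x & -x`, `x & (x - 1)`, or `x ^ (x - 1)` shape directly to instruction selection.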
1 parent 04365dc commit f06d644

2 files changed, +99 -219 lines changed

llvm/lib/Target/X86/X86ISelLowering.cpp

Lines changed: 18 additions & 1 deletion
@@ -28,6 +28,7 @@
 #include "llvm/Analysis/BlockFrequencyInfo.h"
 #include "llvm/Analysis/ProfileSummaryInfo.h"
 #include "llvm/Analysis/VectorUtils.h"
+#include "llvm/CodeGen/ISDOpcodes.h"
 #include "llvm/CodeGen/IntrinsicLowering.h"
 #include "llvm/CodeGen/LivePhysRegs.h"
 #include "llvm/CodeGen/MachineFrameInfo.h"
@@ -35552,8 +35553,24 @@ bool X86TargetLowering::isNarrowingProfitable(SDNode *N, EVT SrcVT,
 bool X86TargetLowering::shouldFoldSelectWithIdentityConstant(
     unsigned BinOpcode, EVT VT, unsigned SelectOpcode, SDValue X,
     SDValue Y) const {
-  if (SelectOpcode != ISD::VSELECT)
+  if (SelectOpcode == ISD::SELECT) {
+    if (VT.isVector())
+      return false;
+    if (!Subtarget.hasBMI() || (VT != MVT::i32 && VT != MVT::i64))
+      return false;
+    using namespace llvm::SDPatternMatch;
+    // BLSI
+    if (BinOpcode == ISD::AND && sd_match(Y, m_Neg(m_Specific(X))))
+      return true;
+    // BLSR
+    if (BinOpcode == ISD::AND && sd_match(Y, m_Add(m_Specific(X), m_AllOnes())))
+      return true;
+    // BLSMSK
+    if (BinOpcode == ISD::XOR && sd_match(Y, m_Add(m_Specific(X), m_AllOnes())))
+      return true;
+
     return false;
+  }
   // TODO: This is too general. There are cases where pre-AVX512 codegen would
   // benefit. The transform may also be profitable for scalar code.
   if (!Subtarget.hasAVX512())
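
The decision made by the new scalar `ISD::SELECT` branch can be summarized as a small truth table. The following is a standalone model, not LLVM code: the enums `BinOp`, `ScalarVT`, `YPattern`, and `BmiInsn` and the functions `targetInsn` and `shouldFoldScalarSelect` are hypothetical stand-ins for `BinOpcode`, `VT`, the `sd_match` results, and the hook's return value. Note that the hunk matches `X + all-ones`, which is the two's complement spelling of the `(sub X 1)` used in the commit message.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the LLVM types used in the hunk above.
enum class BinOp { And, Or, Xor };
enum class ScalarVT { i8, i16, i32, i64 };

// Shape of the operand Y relative to X, as the sd_match calls would classify it.
enum class YPattern {
  NegX,         // Y == 0 - X
  XPlusAllOnes, // Y == X + (-1), i.e. X - 1
  Other
};

// BMI instruction the scalar-select branch is trying to keep reachable.
enum class BmiInsn { None, BLSI, BLSR, BLSMSK };

// Adding the all-ones constant is the same as subtracting 1 (mod 2^32).
static_assert(static_cast<uint32_t>(5u + UINT32_MAX) == 4u,
              "X + all-ones wraps to X - 1");

// Mirrors the order of checks in the diff: BMI and i32/i64 first, then patterns.
inline BmiInsn targetInsn(bool hasBMI, ScalarVT vt, BinOp op, YPattern y) {
  if (!hasBMI || (vt != ScalarVT::i32 && vt != ScalarVT::i64))
    return BmiInsn::None;
  if (op == BinOp::And && y == YPattern::NegX)
    return BmiInsn::BLSI;
  if (op == BinOp::And && y == YPattern::XPlusAllOnes)
    return BmiInsn::BLSR;
  if (op == BinOp::Xor && y == YPattern::XPlusAllOnes)
    return BmiInsn::BLSMSK;
  return BmiInsn::None;
}

// Model of the hook's answer for scalar selects: true only when one of the
// three BMI patterns applies.
inline bool shouldFoldScalarSelect(bool hasBMI, ScalarVT vt, BinOp op,
                                   YPattern y) {
  return targetInsn(hasBMI, vt, op, y) != BmiInsn::None;
}
```

In this model, `xor` with `0 - X` and any pattern on `i8` or `i16` yield `false`, matching the fall-through `return false;` in the hunk.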
