Commit e1569b4

[X86] Distribute Certain Bitwise Operations over SELECT
InstCombine canonicalizes `(select P (and X (- X)) X)` to `(and (select P (- X) umax) X)`. This is counterproductive for the X86 backend when BMI is available because we can encode `(and X (- X))` using the `BLSI` instruction. A similar situation arises if we have `(select P (and X (sub X 1)) X)` (prevents use of the `BLSR` instruction) or `(select P (xor X (sub X 1)) X)` (prevents use of the `BLSMSK` instruction).

Trigger the inverse transformation in the X86 backend if BMI is available and we can use the mentioned BMI instructions. This is done by adjusting the `shouldFoldSelectWithIdentityConstant()` implementation for the X86 backend. In this way, we get `(select P (and X (- X)) X)` again, which enables the use of `BLSI` (and similarly for the other cases described above).

Alive proofs: https://alive2.llvm.org/ce/z/MT_pKi

Fixes #131587, fixes #133848.
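As an informal illustration of why the two forms are interchangeable, the sketch below models both shapes on plain `uint32_t` values and compares them. It is only a spot check, not a substitute for the Alive proofs linked above. All function names are hypothetical and chosen for this example; the identity constant is all-ones for `and` and zero for `xor`.

```cpp
#include <cstdint>

// Hypothetical helper names; plain C++ models of the DAG patterns, not LLVM code.

// "Select of op" forms: the shapes that map directly onto BLSI / BLSR / BLSMSK.
constexpr uint32_t blsiSelectOfOp(bool P, uint32_t X) { return P ? (X & (0u - X)) : X; }
constexpr uint32_t blsrSelectOfOp(bool P, uint32_t X) { return P ? (X & (X - 1u)) : X; }
constexpr uint32_t blsmskSelectOfOp(bool P, uint32_t X) { return P ? (X ^ (X - 1u)) : X; }

// "Op of select" forms: the select picks between the other operand and the
// identity constant of the binary op (all-ones for AND, zero for XOR).
constexpr uint32_t blsiOpOfSelect(bool P, uint32_t X) { return (P ? (0u - X) : ~0u) & X; }
constexpr uint32_t blsrOpOfSelect(bool P, uint32_t X) { return (P ? (X - 1u) : ~0u) & X; }
constexpr uint32_t blsmskOpOfSelect(bool P, uint32_t X) { return (P ? (X - 1u) : 0u) ^ X; }

// True if both shapes agree for a given predicate and input.
constexpr bool formsAgreeFor(bool P, uint32_t X) {
  return blsiSelectOfOp(P, X) == blsiOpOfSelect(P, X) &&
         blsrSelectOfOp(P, X) == blsrOpOfSelect(P, X) &&
         blsmskSelectOfOp(P, X) == blsmskOpOfSelect(P, X);
}

// True if both shapes agree for both predicate values.
constexpr bool formsAgree(uint32_t X) {
  return formsAgreeFor(false, X) && formsAgreeFor(true, X);
}

// Compile-time spot checks on a few inputs, including the zero edge case.
static_assert(formsAgree(0u), "forms differ for X = 0");
static_assert(formsAgree(12u), "forms differ for X = 12");
static_assert(formsAgree(0x80000000u), "forms differ for X = 0x80000000");
```

Since the two shapes compute the same value, the choice between them is purely a question of which one the instruction selector can match more cheaply.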
1 parent e0207b3 commit e1569b4

2 files changed: +101 -219 lines changed

llvm/lib/Target/X86/X86ISelLowering.cpp

Lines changed: 20 additions & 1 deletion
@@ -27,6 +27,7 @@
 #include "llvm/Analysis/BlockFrequencyInfo.h"
 #include "llvm/Analysis/ProfileSummaryInfo.h"
 #include "llvm/Analysis/VectorUtils.h"
+#include "llvm/CodeGen/ISDOpcodes.h"
 #include "llvm/CodeGen/IntrinsicLowering.h"
 #include "llvm/CodeGen/LivePhysRegs.h"
 #include "llvm/CodeGen/MachineFrameInfo.h"
@@ -35386,8 +35387,26 @@ bool X86TargetLowering::isNarrowingProfitable(SDNode *N, EVT SrcVT,
 bool X86TargetLowering::shouldFoldSelectWithIdentityConstant(
     unsigned BinOpcode, EVT VT, unsigned SelectOpcode, SDValue X,
     SDValue NonIdConstNode) const {
-  if (SelectOpcode != ISD::VSELECT)
+  if (SelectOpcode == ISD::SELECT) {
+    if (VT.isVector())
+      return false;
+    if (!Subtarget.hasBMI() || (VT != MVT::i32 && VT != MVT::i64))
+      return false;
+    using namespace llvm::SDPatternMatch;
+    // BLSI
+    if (BinOpcode == ISD::AND && sd_match(NonIdConstNode, m_Neg(m_Specific(X))))
+      return true;
+    // BLSR
+    if (BinOpcode == ISD::AND &&
+        sd_match(NonIdConstNode, m_Add(m_Specific(X), m_AllOnes())))
+      return true;
+    // BLSMSK
+    if (BinOpcode == ISD::XOR &&
+        sd_match(NonIdConstNode, m_Add(m_Specific(X), m_AllOnes())))
+      return true;
+
     return false;
+  }
   // TODO: This is too general. There are cases where pre-AVX512 codegen would
   // benefit. The transform may also be profitable for scalar code.
   if (!Subtarget.hasAVX512())