-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[InstCombine] Improve folding of icmp pred (and X, Mask/~Mask), Y)
#81562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-ir Author: None (goldsteinn) Changes
Patch is 47.65 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/81562.diff 17 Files Affected:
diff --git a/llvm/include/llvm/IR/PatternMatch.h b/llvm/include/llvm/IR/PatternMatch.h
index fed552414298ad..487ae170210de9 100644
--- a/llvm/include/llvm/IR/PatternMatch.h
+++ b/llvm/include/llvm/IR/PatternMatch.h
@@ -564,6 +564,19 @@ inline api_pred_ty<is_negated_power2> m_NegatedPower2(const APInt *&V) {
return V;
}
+struct is_negated_power2_or_zero {
+ bool isValue(const APInt &C) { return !C || C.isNegatedPowerOf2(); }
+};
+/// Match a integer or vector negated power-of-2.
+/// For vectors, this includes constants with undefined elements.
+inline cst_pred_ty<is_negated_power2_or_zero> m_NegatedPower2OrZero() {
+ return cst_pred_ty<is_negated_power2_or_zero>();
+}
+inline api_pred_ty<is_negated_power2_or_zero>
+m_NegatedPower2OrZero(const APInt *&V) {
+ return V;
+}
+
struct is_power2_or_zero {
bool isValue(const APInt &C) { return !C || C.isPowerOf2(); }
};
@@ -595,6 +608,18 @@ inline cst_pred_ty<is_lowbit_mask> m_LowBitMask() {
}
inline api_pred_ty<is_lowbit_mask> m_LowBitMask(const APInt *&V) { return V; }
+struct is_lowbit_mask_or_zero {
+ bool isValue(const APInt &C) { return !C || C.isMask(); }
+};
+/// Match an integer or vector with only the low bit(s) set.
+/// For vectors, this includes constants with undefined elements.
+inline cst_pred_ty<is_lowbit_mask_or_zero> m_LowBitMaskOrZero() {
+ return cst_pred_ty<is_lowbit_mask_or_zero>();
+}
+inline api_pred_ty<is_lowbit_mask_or_zero> m_LowBitMaskOrZero(const APInt *&V) {
+ return V;
+}
+
struct icmp_pred_with_threshold {
ICmpInst::Predicate Pred;
const APInt *Thr;
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 280c4d77b6dfca..b33004aefb7405 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -4068,11 +4068,115 @@ Instruction *InstCombinerImpl::foldSelectICmp(ICmpInst::Predicate Pred,
return nullptr;
}
+// Returns of V is a Mask ((X + 1) & X == 0) or ~Mask (-Pow2OrZero)
+static bool isMaskOrZero(const Value *V, bool Not, const SimplifyQuery &Q,
+ unsigned Depth = 0) {
+ if (Not ? match(V, m_NegatedPower2OrZero()) : match(V, m_LowBitMaskOrZero()))
+ return true;
+ if (V->getType()->getScalarSizeInBits() == 1)
+ return true;
+ if (Depth++ >= MaxAnalysisRecursionDepth)
+ return false;
+ Value *X;
+ if (match(V, m_Not(m_Value(X))))
+ return isMaskOrZero(X, !Not, Q, Depth);
+ const Operator *I = dyn_cast<Operator>(V);
+ if (I == nullptr)
+ return false;
+ switch (I->getOpcode()) {
+ case Instruction::ZExt:
+ // ZExt(Mask) is a Mask.
+ return !Not && isMaskOrZero(I->getOperand(0), Not, Q, Depth);
+ case Instruction::SExt:
+ // SExt(Mask) is a Mask.
+ // SExt(~Mask) is a ~Mask.
+ return isMaskOrZero(I->getOperand(0), Not, Q, Depth);
+ case Instruction::And:
+ case Instruction::Or:
+ // Mask0 | Mask1 is a Mask.
+ // Mask0 & Mask1 is a Mask.
+ // ~Mask0 | ~Mask1 is a ~Mask.
+ // ~Mask0 & ~Mask1 is a ~Mask.
+ return isMaskOrZero(I->getOperand(1), Not, Q, Depth) &&
+ isMaskOrZero(I->getOperand(0), Not, Q, Depth);
+ case Instruction::Xor:
+ // (X ^ (X - 1)) is a Mask
+ return match(V, m_c_Xor(m_Value(X), m_Add(m_Deferred(X), m_AllOnes())));
+ case Instruction::Select:
+ // c ? Mask0 : Mask1 is a Mask.
+ return isMaskOrZero(I->getOperand(1), Not, Q, Depth) &&
+ isMaskOrZero(I->getOperand(2), Not, Q, Depth);
+ case Instruction::Shl:
+ if (Not) {
+ // (-1 >> X) << X is ~Mask
+ if (match(I->getOperand(0),
+ m_Shr(m_AllOnes(), m_Specific(I->getOperand(1)))))
+ return true;
+
+ // (~Mask) << X is a ~Mask.
+ return isMaskOrZero(I->getOperand(0), Not, Q, Depth);
+ }
+ break;
+ case Instruction::LShr:
+ if (!Not) {
+ // (-1 << X) >> X is a Mask
+ if (match(I->getOperand(0),
+ m_Shl(m_AllOnes(), m_Specific(I->getOperand(1)))))
+ return true;
+ // Mask >> X is a Mask.
+ return isMaskOrZero(I->getOperand(0), Not, Q, Depth);
+ }
+ return false;
+ case Instruction::AShr:
+ // Mask s>> X is a Mask.
+ // ~Mask s>> X is a ~Mask.
+ return isMaskOrZero(I->getOperand(0), Not, Q, Depth);
+ case Instruction::Add:
+ // Pow2 - 1 is a Mask.
+ if (!Not && match(I->getOperand(1), m_AllOnes()))
+ return isKnownToBeAPowerOfTwo(I->getOperand(0), Q.DL, /*OrZero*/ true,
+ Depth, Q.AC, Q.CxtI, Q.DT);
+ break;
+ case Instruction::Sub:
+ // -Pow2 is a ~Mask.
+ if (Not && match(I->getOperand(0), m_Zero()))
+ return isKnownToBeAPowerOfTwo(I->getOperand(1), Q.DL, /*OrZero*/ true,
+ Depth, Q.AC, Q.CxtI, Q.DT);
+ break;
+ case Instruction::Invoke:
+ case Instruction::Call: {
+ if (auto *II = dyn_cast<IntrinsicInst>(I)) {
+ switch (II->getIntrinsicID()) {
+ // min/max(Mask0, Mask1) is a Mask.
+ // min/max(~Mask0, ~Mask1) is a ~Mask.
+ case Intrinsic::umax:
+ case Intrinsic::smax:
+ case Intrinsic::umin:
+ case Intrinsic::smin:
+ return isMaskOrZero(II->getArgOperand(1), Not, Q, Depth) &&
+ isMaskOrZero(II->getArgOperand(0), Not, Q, Depth);
+
+ // In the context of masks, bitreverse(Mask) == ~Mask
+ case Intrinsic::bitreverse:
+ return isMaskOrZero(II->getArgOperand(0), !Not, Q, Depth);
+ default:
+ break;
+ }
+ }
+ break;
+ }
+ default:
+ break;
+ }
+ return false;
+}
+
/// Some comparisons can be simplified.
/// In this case, we are looking for comparisons that look like
/// a check for a lossy truncation.
/// Folds:
/// icmp SrcPred (x & Mask), x to icmp DstPred x, Mask
+/// icmp eq/ne (x & ~Mask), 0 to icmp DstPred x, Mask
/// Where Mask is some pattern that produces all-ones in low bits:
/// (-1 >> y)
/// ((-1 << y) >> y) <- non-canonical, has extra uses
@@ -4081,21 +4185,45 @@ Instruction *InstCombinerImpl::foldSelectICmp(ICmpInst::Predicate Pred,
/// The Mask can be a constant, too.
/// For some predicates, the operands are commutative.
/// For others, x can only be on a specific side.
-static Value *foldICmpWithLowBitMaskedVal(ICmpInst &I,
- InstCombiner::BuilderTy &Builder) {
+static Value *foldICmpWithLowBitMaskedVal(ICmpInst &I, const SimplifyQuery &Q,
+ InstCombiner &IC) {
+
+ Value *X, *M;
+ ICmpInst::Predicate Pred = I.getPredicate();
ICmpInst::Predicate SrcPred;
- Value *X, *M, *Y;
- auto m_VariableMask = m_CombineOr(
- m_CombineOr(m_Not(m_Shl(m_AllOnes(), m_Value())),
- m_Add(m_Shl(m_One(), m_Value()), m_AllOnes())),
- m_CombineOr(m_LShr(m_AllOnes(), m_Value()),
- m_LShr(m_Shl(m_AllOnes(), m_Value(Y)), m_Deferred(Y))));
- auto m_Mask = m_CombineOr(m_VariableMask, m_LowBitMask());
- if (!match(&I, m_c_ICmp(SrcPred,
- m_c_And(m_CombineAnd(m_Mask, m_Value(M)), m_Value(X)),
- m_Deferred(X))))
- return nullptr;
+ bool NeedsNot = false;
+
+ auto CheckMask = [&](Value *V, bool Not) {
+ if (!ICmpInst::isSigned(Pred))
+ return isMaskOrZero(V, Not, Q);
+ return Not ? match(V, m_NegatedPower2OrZero())
+ : match(V, m_LowBitMaskOrZero());
+ };
+
+ auto TryMatch = [&](unsigned OpNo) {
+ SrcPred = Pred;
+ if (match(I.getOperand(OpNo),
+ m_c_And(m_Specific(I.getOperand(1 - OpNo)), m_Value(M)))) {
+ X = I.getOperand(1 - OpNo);
+ if (OpNo)
+ SrcPred = ICmpInst::getSwappedPredicate(Pred);
+ return CheckMask(M, /*Not*/ false);
+ }
+ if (OpNo == 1 && match(I.getOperand(1), m_Zero()) &&
+ ICmpInst::isEquality(Pred) &&
+ match(I.getOperand(0), m_OneUse(m_And(m_Value(X), m_Value(M))))) {
+ NeedsNot = true;
+ if (IC.isFreeToInvert(X, X->hasOneUse()) && CheckMask(X, /*Not*/ true)) {
+ std::swap(X, M);
+ return true;
+ }
+ return IC.isFreeToInvert(M, M->hasOneUse()) && CheckMask(M, /*Not*/ true);
+ }
+ return false;
+ };
+ if (!TryMatch(0) && !TryMatch(1))
+ return nullptr;
ICmpInst::Predicate DstPred;
switch (SrcPred) {
case ICmpInst::Predicate::ICMP_EQ:
@@ -4163,7 +4291,9 @@ static Value *foldICmpWithLowBitMaskedVal(ICmpInst &I,
M = Constant::replaceUndefsWith(VecC, SafeReplacementConstant);
}
- return Builder.CreateICmp(DstPred, X, M);
+ if (NeedsNot)
+ M = IC.Builder.CreateNot(M);
+ return IC.Builder.CreateICmp(DstPred, X, M);
}
/// Some comparisons can be simplified.
@@ -5080,7 +5210,7 @@ Instruction *InstCombinerImpl::foldICmpBinOp(ICmpInst &I,
if (Value *V = foldMultiplicationOverflowCheck(I))
return replaceInstUsesWith(I, V);
- if (Value *V = foldICmpWithLowBitMaskedVal(I, Builder))
+ if (Value *V = foldICmpWithLowBitMaskedVal(I, Q, *this))
return replaceInstUsesWith(I, V);
if (Instruction *R = foldICmpAndXX(I, Q, *this))
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-eq-to-icmp-ule.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-eq-to-icmp-ule.ll
index a957fb2d088ef4..5b7a99d53c308c 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-eq-to-icmp-ule.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-eq-to-icmp-ule.ll
@@ -62,8 +62,7 @@ define <2 x i1> @p2_vec_nonsplat(<2 x i8> %x) {
define <2 x i1> @p2_vec_nonsplat_edgecase0(<2 x i8> %x) {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase0(
-; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[X:%.*]], <i8 -4, i8 -1>
-; CHECK-NEXT: [[RET:%.*]] = icmp eq <2 x i8> [[TMP1]], zeroinitializer
+; CHECK-NEXT: [[RET:%.*]] = icmp ult <2 x i8> [[X:%.*]], <i8 4, i8 1>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%tmp0 = and <2 x i8> %x, <i8 3, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ne-to-icmp-ugt.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ne-to-icmp-ugt.ll
index 57361cdf38977c..160d968b9ac4c7 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ne-to-icmp-ugt.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ne-to-icmp-ugt.ll
@@ -62,8 +62,7 @@ define <2 x i1> @p2_vec_nonsplat(<2 x i8> %x) {
define <2 x i1> @p2_vec_nonsplat_edgecase0(<2 x i8> %x) {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase0(
-; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[X:%.*]], <i8 -4, i8 -1>
-; CHECK-NEXT: [[RET:%.*]] = icmp ne <2 x i8> [[TMP1]], zeroinitializer
+; CHECK-NEXT: [[RET:%.*]] = icmp ugt <2 x i8> [[X:%.*]], <i8 3, i8 0>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%tmp0 = and <2 x i8> %x, <i8 3, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sge-to-icmp-sle.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sge-to-icmp-sle.ll
index 0dfc9f51baf9c2..60921042d52435 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sge-to-icmp-sle.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sge-to-icmp-sle.ll
@@ -50,8 +50,7 @@ define <2 x i1> @p2_vec_nonsplat(<2 x i8> %x) {
define <2 x i1> @p2_vec_nonsplat_edgecase(<2 x i8> %x) {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase(
-; CHECK-NEXT: [[TMP0:%.*]] = and <2 x i8> [[X:%.*]], <i8 3, i8 0>
-; CHECK-NEXT: [[RET:%.*]] = icmp sge <2 x i8> [[TMP0]], [[X]]
+; CHECK-NEXT: [[RET:%.*]] = icmp slt <2 x i8> [[X:%.*]], <i8 4, i8 1>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%tmp0 = and <2 x i8> %x, <i8 3, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sgt-to-icmp-sgt.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sgt-to-icmp-sgt.ll
index e0893ce4cf2ecb..6345e70d7220e2 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sgt-to-icmp-sgt.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sgt-to-icmp-sgt.ll
@@ -63,8 +63,7 @@ define <2 x i1> @p2_vec_nonsplat() {
define <2 x i1> @p2_vec_nonsplat_edgecase() {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase(
; CHECK-NEXT: [[X:%.*]] = call <2 x i8> @gen2x8()
-; CHECK-NEXT: [[TMP0:%.*]] = and <2 x i8> [[X]], <i8 3, i8 0>
-; CHECK-NEXT: [[RET:%.*]] = icmp sgt <2 x i8> [[X]], [[TMP0]]
+; CHECK-NEXT: [[RET:%.*]] = icmp sgt <2 x i8> [[X]], <i8 3, i8 0>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%x = call <2 x i8> @gen2x8()
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sle-to-icmp-sle.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sle-to-icmp-sle.ll
index 81887a39091573..b7aec53fed6760 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sle-to-icmp-sle.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-sle-to-icmp-sle.ll
@@ -63,8 +63,7 @@ define <2 x i1> @p2_vec_nonsplat() {
define <2 x i1> @p2_vec_nonsplat_edgecase() {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase(
; CHECK-NEXT: [[X:%.*]] = call <2 x i8> @gen2x8()
-; CHECK-NEXT: [[TMP0:%.*]] = and <2 x i8> [[X]], <i8 3, i8 0>
-; CHECK-NEXT: [[RET:%.*]] = icmp sle <2 x i8> [[X]], [[TMP0]]
+; CHECK-NEXT: [[RET:%.*]] = icmp slt <2 x i8> [[X]], <i8 4, i8 1>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%x = call <2 x i8> @gen2x8()
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-slt-to-icmp-sgt.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-slt-to-icmp-sgt.ll
index 8ce8687f198446..56661d335c4f60 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-slt-to-icmp-sgt.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-slt-to-icmp-sgt.ll
@@ -50,8 +50,7 @@ define <2 x i1> @p2_vec_nonsplat(<2 x i8> %x) {
define <2 x i1> @p2_vec_nonsplat_edgecase(<2 x i8> %x) {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase(
-; CHECK-NEXT: [[TMP0:%.*]] = and <2 x i8> [[X:%.*]], <i8 3, i8 0>
-; CHECK-NEXT: [[RET:%.*]] = icmp slt <2 x i8> [[TMP0]], [[X]]
+; CHECK-NEXT: [[RET:%.*]] = icmp sgt <2 x i8> [[X:%.*]], <i8 3, i8 0>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%tmp0 = and <2 x i8> %x, <i8 3, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-uge-to-icmp-ule.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-uge-to-icmp-ule.ll
index ff09e255185b5a..a93e8f779435fc 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-uge-to-icmp-ule.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-uge-to-icmp-ule.ll
@@ -62,8 +62,7 @@ define <2 x i1> @p2_vec_nonsplat(<2 x i8> %x) {
define <2 x i1> @p2_vec_nonsplat_edgecase0(<2 x i8> %x) {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase0(
-; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[X:%.*]], <i8 -4, i8 -1>
-; CHECK-NEXT: [[RET:%.*]] = icmp eq <2 x i8> [[TMP1]], zeroinitializer
+; CHECK-NEXT: [[RET:%.*]] = icmp ult <2 x i8> [[X:%.*]], <i8 4, i8 1>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%tmp0 = and <2 x i8> %x, <i8 3, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ugt-to-icmp-ugt.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ugt-to-icmp-ugt.ll
index 4ad04710fd7bb9..73ea4d456d2462 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ugt-to-icmp-ugt.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ugt-to-icmp-ugt.ll
@@ -75,8 +75,7 @@ define <2 x i1> @p2_vec_nonsplat() {
define <2 x i1> @p2_vec_nonsplat_edgecase0() {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase0(
; CHECK-NEXT: [[X:%.*]] = call <2 x i8> @gen2x8()
-; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[X]], <i8 -4, i8 -1>
-; CHECK-NEXT: [[RET:%.*]] = icmp ne <2 x i8> [[TMP1]], zeroinitializer
+; CHECK-NEXT: [[RET:%.*]] = icmp ugt <2 x i8> [[X]], <i8 3, i8 0>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%x = call <2 x i8> @gen2x8()
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ule-to-icmp-ule.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ule-to-icmp-ule.ll
index 8e513dcbf4ef3a..53886b5f2dc9c3 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ule-to-icmp-ule.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ule-to-icmp-ule.ll
@@ -75,8 +75,7 @@ define <2 x i1> @p2_vec_nonsplat() {
define <2 x i1> @p2_vec_nonsplat_edgecase0() {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase0(
; CHECK-NEXT: [[X:%.*]] = call <2 x i8> @gen2x8()
-; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[X]], <i8 -4, i8 -1>
-; CHECK-NEXT: [[RET:%.*]] = icmp eq <2 x i8> [[TMP1]], zeroinitializer
+; CHECK-NEXT: [[RET:%.*]] = icmp ult <2 x i8> [[X]], <i8 4, i8 1>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%x = call <2 x i8> @gen2x8()
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ult-to-icmp-ugt.ll b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ult-to-icmp-ugt.ll
index d02ecf6965e878..d66be571008c2f 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ult-to-icmp-ugt.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-ult-to-icmp-ugt.ll
@@ -62,8 +62,7 @@ define <2 x i1> @p2_vec_nonsplat(<2 x i8> %x) {
define <2 x i1> @p2_vec_nonsplat_edgecase0(<2 x i8> %x) {
; CHECK-LABEL: @p2_vec_nonsplat_edgecase0(
-; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[X:%.*]], <i8 -4, i8 -1>
-; CHECK-NEXT: [[RET:%.*]] = icmp ne <2 x i8> [[TMP1]], zeroinitializer
+; CHECK-NEXT: [[RET:%.*]] = icmp ugt <2 x i8> [[X:%.*]], <i8 3, i8 0>
; CHECK-NEXT: ret <2 x i1> [[RET]]
;
%tmp0 = and <2 x i8> %x, <i8 3, i8 0>
diff --git a/llvm/test/Transforms/InstCombine/icmp-and-lowbit-mask.ll b/llvm/test/Transforms/InstCombine/icmp-and-lowbit-mask.ll
new file mode 100644
index 00000000000000..b06efaac25d169
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/icmp-and-lowbit-mask.ll
@@ -0,0 +1,629 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+declare void @llvm.assume(i1)
+declare i8 @llvm.ctpop.i8(i8)
+declare i8 @llvm.umin.i8(i8, i8)
+declare i8 @llvm.umax.i8(i8, i8)
+declare i8 @llvm.smin.i8(i8, i8)
+declare i8 @llvm.smax.i8(i8, i8)
+declare i8 @llvm.bitreverse.i8(i8)
+declare void @use.i8(i8)
+declare void @use.i16(i16)
+define i1 @src_is_mask_zext(i16 %x_in, i8 %y) {
+; CHECK-LABEL: @src_is_mask_zext(
+; CHECK-NEXT: [[X:%.*]] = xor i16 [[X_IN:%.*]], 123
+; CHECK-NEXT: [[M_IN:%.*]] = lshr i8 -1, [[Y:%.*]]
+; CHECK-NEXT: [[MASK:%.*]] = zext i8 [[M_IN]] to i16
+; CHECK-NEXT: [[R:%.*]] = icmp ule i16 [[X]], [[MASK]]
+; CHECK-NEXT: ret i1 [[R]]
+;
+ %x = xor i16 %x_in, 123
+ %m_in = lshr i8 -1, %y
+ %mask = zext i8 %m_in to i16
+
+ %and = and i16 %x, %mask
+ %r = icmp eq i16 %and, %x
+ ret i1 %r
+}
+
+define i1 @src_is_mask_zext_fail_not_mask(i16 %x_in, i8 %y) {
+; CHECK-LABEL: @src_is_mask_zext_fail_not_mask(
+; CHECK-NEXT: [[X:%.*]] = xor i16 [[X_IN:%.*]], 123
+; CHECK-NEXT: [[M_IN:%.*]] = lshr i8 -2, [[Y:%.*]]
+; CHECK-NEXT: [[MASK:%.*]] = zext i8 [[M_IN]] to i16
+; CHECK-NEXT: [[AND:%.*]] = and i16 [[X]], [[MASK]]
+; CHECK-NEXT: [[R:%.*]] = icmp eq i16 [[AND]], [[X]]
+; CHECK-NEXT: ret i1 [[R]]
+;
+ %x = xor i16 %x_in, 123
+ %m_in = lshr i8 -2, %y
+ %mask = zext i8 %m_in to i16
+
+ %and = and i16 %x, %mask
+ %r = icmp eq i16 %and, %x
+ ret i1 %r
+}
+
+define i1 @src_is_mask_sext(i16 %x_in, i8 %y) {
+; CHECK-LABEL: @src_is_mask_sext(
+; CHECK-NEXT: [[X:%.*]] = xor i16 [[X_IN:%.*]], 123
+; CHECK-NEXT: [[M_IN:%.*]] = lshr i8 31, [[Y:%.*]]
+; CHECK-NEXT: [[MASK:%.*]] = zext nneg i8 [[M_IN]] to...
[truncated]
|
icmp pred (and X, Mask/~Mask), Y)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look great!
ea9f6e4
to
ec40daf
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
@@ -4163,7 +4293,9 @@ static Value *foldICmpWithLowBitMaskedVal(ICmpInst &I, | |||
M = Constant::replaceUndefsWith(VecC, SafeReplacementConstant); | |||
} | |||
|
|||
return Builder.CreateICmp(DstPred, X, M); | |||
if (NeedsNot) | |||
M = IC.Builder.CreateNot(M); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use IC.getFreelyInverted
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's freely invertible, `(not M)`
will simplify later on. Seems simpler to let the existing pipeline clean it up. No?
3041bd3
to
5d13b45
Compare
ping |
1 similar comment
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Please wait for additional approval from other reviewers.
5d13b45
to
194152e
Compare
rebased |
ping @nikic |
ping |
if (Depth++ >= MaxAnalysisRecursionDepth) | ||
return false; | ||
Value *X; | ||
const Operator *I = dyn_cast<Operator>(V); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Instruction
instead of Operator
here. I don't see a good reason to make this deal with constant expressions.
return false; | ||
Value *X; | ||
const Operator *I = dyn_cast<Operator>(V); | ||
if (I == nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (I == nullptr) | |
if (!I) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's an issue for LLVM, but on systems where nullptr
is non-zero, does !I
still work?
|
||
if (!TryMatch(0) && !TryMatch(1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you instead call this fold from foldICmpCommutative() and only handle one direction here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, commit to do so is split into NFC.
194152e
to
09399b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
static Value *foldICmpWithLowBitMaskedVal(ICmpInst::Predicate Pred, Value *Op0, | ||
Value *Op1, const SimplifyQuery &Q, | ||
InstCombiner &IC) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there's an extra newline here.
if (!ICmpInst::isSigned(Pred)) | ||
return isMaskOrZero(V, Not, Q); | ||
return Not ? match(V, m_NegatedPower2OrZero()) | ||
: match(V, m_LowBitMaskOrZero()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!ICmpInst::isSigned(Pred)) | |
return isMaskOrZero(V, Not, Q); | |
return Not ? match(V, m_NegatedPower2OrZero()) | |
: match(V, m_LowBitMaskOrZero()); | |
if (CmpInst::isSigned(Pred) && !isa<Constant>(V)) | |
return false; | |
return isMaskOrZero(V, Not, Q); |
I think it would be clearer to express it like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They seem about the same to me, happy to change.
Make recursive matcher that is able to detect a lot more patterns. Proofs for all supported patterns: https://alive2.llvm.org/ce/z/fSQ3nZ Differential Revision: https://reviews.llvm.org/D159058
…foldICmpWithLowBitMaskedVal` `(icmp eq/ne (and X, ~Mask), 0)` is equivalent to `(icmp eq/ne (and X, Mask), X)` and we sometimes generate the former pattern intentionally to reduce the number of uses of `X`. Proof: https://alive2.llvm.org/ce/z/3u-usC Differential Revision: https://reviews.llvm.org/D159329
09399b6
to
ceb1d02
Compare
[InstCombine] Improve mask detection in `foldICmpWithLowBitMaskedVal`; NFC
[InstCombine] Handle `(icmp eq/ne (and X, ~Mask), 0)` pattern in `foldICmpWithLowBitMaskedVal`
Proofs for new patterns: https://alive2.llvm.org/ce/z/fSQ3nZ