[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. #135817

paulwalker-arm · 2025-04-15T16:58:11Z

We can easily select compare-to-zero instructions without dedicated nodes. The test changes show opportunities that were previous missed because of the redundant complexity.

The global-isel changes are due to isBuildVectorAllZeros not identifying all zero floating point vectors. Despite the use of getAnyConstantSplat, which does work, its result is wrapped inside m_SpecificICst that pushes us down an integer only path. I am new to global-isel so is it safe to assume the result of getAnyConstantSplat can be tested directly?

llvmbot · 2025-04-15T16:58:49Z

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-globalisel

Author: Paul Walker (paulwalker-arm)

Changes

We can easily select compare-to-zero instructions without dedicated nodes. The test changes show opportunities that were previous missed because of the redundant complexity.

The global-isel changes are due to isBuildVectorAllZeros not identifying all zero floating point vectors. Despite the use of getAnyConstantSplat, which does work, its result is wrapped inside m_SpecificICst that pushes us down an integer only path. I am new to global-isel so is it safe to assume the result of getAnyConstantSplat can be tested directly?

Patch is 29.19 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135817.diff

11 Files Affected:

(modified) llvm/lib/CodeGen/GlobalISel/Utils.cpp (+2-1)
(modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1-36)
(modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (-7)
(modified) llvm/lib/Target/AArch64/AArch64InstrFormats.td (+1-1)
(modified) llvm/lib/Target/AArch64/AArch64InstrGISel.td (-36)
(modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+14-5)
(modified) llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp (+22-47)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/lower-neon-vector-fcmp.mir (+45-40)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/select-neon-vector-fcmp.mir (+5-5)
(modified) llvm/test/CodeGen/AArch64/arm64-zip.ll (+1-2)
(modified) llvm/test/CodeGen/AArch64/select_cc.ll (+2-3)

diff --git a/llvm/lib/CodeGen/GlobalISel/Utils.cpp b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
index 223d69c362185..d8cc86b34a819 100644
--- a/llvm/lib/CodeGen/GlobalISel/Utils.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
@@ -1385,7 +1385,8 @@ bool llvm::isBuildVectorConstantSplat(const Register Reg,
                                       const MachineRegisterInfo &MRI,
                                       int64_t SplatValue, bool AllowUndef) {
   if (auto SplatValAndReg = getAnyConstantSplat(Reg, MRI, AllowUndef))
-    return mi_match(SplatValAndReg->VReg, MRI, m_SpecificICst(SplatValue));
+    return SplatValAndReg->Value.getSExtValue() == SplatValue;
+
   return false;
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 830ec6886e6bc..eb58dd2b5b1d8 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2592,11 +2592,6 @@ unsigned AArch64TargetLowering::ComputeNumSignBitsForTargetNode(
   case AArch64ISD::FCMEQ:
   case AArch64ISD::FCMGE:
   case AArch64ISD::FCMGT:
-  case AArch64ISD::FCMEQz:
-  case AArch64ISD::FCMGEz:
-  case AArch64ISD::FCMGTz:
-  case AArch64ISD::FCMLEz:
-  case AArch64ISD::FCMLTz:
     // Compares return either 0 or all-ones
     return VTBits;
   case AArch64ISD::VASHR: {
@@ -2813,11 +2808,6 @@ const char *AArch64TargetLowering::getTargetNodeName(unsigned Opcode) const {
     MAKE_CASE(AArch64ISD::FCMEQ)
     MAKE_CASE(AArch64ISD::FCMGE)
     MAKE_CASE(AArch64ISD::FCMGT)
-    MAKE_CASE(AArch64ISD::FCMEQz)
-    MAKE_CASE(AArch64ISD::FCMGEz)
-    MAKE_CASE(AArch64ISD::FCMGTz)
-    MAKE_CASE(AArch64ISD::FCMLEz)
-    MAKE_CASE(AArch64ISD::FCMLTz)
     MAKE_CASE(AArch64ISD::SADDV)
     MAKE_CASE(AArch64ISD::UADDV)
     MAKE_CASE(AArch64ISD::UADDLV)
@@ -15821,40 +15811,19 @@ static SDValue EmitVectorComparison(SDValue LHS, SDValue RHS,
   assert(VT.getSizeInBits() == SrcVT.getSizeInBits() &&
          "function only supposed to emit natural comparisons");
 
-  APInt SplatValue;
-  APInt SplatUndef;
-  unsigned SplatBitSize = 0;
-  bool HasAnyUndefs;
-
-  BuildVectorSDNode *BVN = dyn_cast<BuildVectorSDNode>(RHS.getNode());
-  bool IsCnst = BVN && BVN->isConstantSplat(SplatValue, SplatUndef,
-                                            SplatBitSize, HasAnyUndefs);
-
-  bool IsZero = IsCnst && SplatValue == 0;
-
   if (SrcVT.getVectorElementType().isFloatingPoint()) {
     switch (CC) {
     default:
       return SDValue();
     case AArch64CC::NE: {
-      SDValue Fcmeq;
-      if (IsZero)
-        Fcmeq = DAG.getNode(AArch64ISD::FCMEQz, dl, VT, LHS);
-      else
-        Fcmeq = DAG.getNode(AArch64ISD::FCMEQ, dl, VT, LHS, RHS);
+      SDValue Fcmeq = DAG.getNode(AArch64ISD::FCMEQ, dl, VT, LHS, RHS);
       return DAG.getNOT(dl, Fcmeq, VT);
     }
     case AArch64CC::EQ:
-      if (IsZero)
-        return DAG.getNode(AArch64ISD::FCMEQz, dl, VT, LHS);
       return DAG.getNode(AArch64ISD::FCMEQ, dl, VT, LHS, RHS);
     case AArch64CC::GE:
-      if (IsZero)
-        return DAG.getNode(AArch64ISD::FCMGEz, dl, VT, LHS);
       return DAG.getNode(AArch64ISD::FCMGE, dl, VT, LHS, RHS);
     case AArch64CC::GT:
-      if (IsZero)
-        return DAG.getNode(AArch64ISD::FCMGTz, dl, VT, LHS);
       return DAG.getNode(AArch64ISD::FCMGT, dl, VT, LHS, RHS);
     case AArch64CC::LE:
       if (!NoNans)
@@ -15862,8 +15831,6 @@ static SDValue EmitVectorComparison(SDValue LHS, SDValue RHS,
       // If we ignore NaNs then we can use to the LS implementation.
       [[fallthrough]];
     case AArch64CC::LS:
-      if (IsZero)
-        return DAG.getNode(AArch64ISD::FCMLEz, dl, VT, LHS);
       return DAG.getNode(AArch64ISD::FCMGE, dl, VT, RHS, LHS);
     case AArch64CC::LT:
       if (!NoNans)
@@ -15871,8 +15838,6 @@ static SDValue EmitVectorComparison(SDValue LHS, SDValue RHS,
       // If we ignore NaNs then we can use to the MI implementation.
       [[fallthrough]];
     case AArch64CC::MI:
-      if (IsZero)
-        return DAG.getNode(AArch64ISD::FCMLTz, dl, VT, LHS);
       return DAG.getNode(AArch64ISD::FCMGT, dl, VT, RHS, LHS);
     }
   }
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 0d51ef2be8631..adbe7e9d0a0f3 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -245,13 +245,6 @@ enum NodeType : unsigned {
   FCMGE,
   FCMGT,
 
-  // Vector zero comparisons
-  FCMEQz,
-  FCMGEz,
-  FCMGTz,
-  FCMLEz,
-  FCMLTz,
-
   // Round wide FP to narrow FP with inexact results to odd.
   FCVTXN,
 
diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 9bbcb6f3aedf5..2a0da9a1373ee 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -7136,7 +7136,7 @@ multiclass SIMDCmpTwoVector<bit U, bits<5> opc, string asm,
 
 // FP Comparisons support only S and D element sizes (and H for v8.2a).
 multiclass SIMDFPCmpTwoVector<bit U, bit S, bits<5> opc,
-                              string asm, SDNode OpNode> {
+                              string asm, SDPatternOperator OpNode> {
 
   let mayRaiseFPException = 1, Uses = [FPCR] in {
   let Predicates = [HasNEON, HasFullFP16] in {
diff --git a/llvm/lib/Target/AArch64/AArch64InstrGISel.td b/llvm/lib/Target/AArch64/AArch64InstrGISel.td
index a99019d72b4ac..7322212c5bb24 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrGISel.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrGISel.td
@@ -179,36 +179,6 @@ def G_FCMGT : AArch64GenericInstruction {
   let hasSideEffects = 0;
 }
 
-def G_FCMEQZ : AArch64GenericInstruction {
-  let OutOperandList = (outs type0:$dst);
-  let InOperandList = (ins type0:$src);
-  let hasSideEffects = 0;
-}
-
-def G_FCMGEZ : AArch64GenericInstruction {
-  let OutOperandList = (outs type0:$dst);
-  let InOperandList = (ins type0:$src);
-  let hasSideEffects = 0;
-}
-
-def G_FCMGTZ : AArch64GenericInstruction {
-  let OutOperandList = (outs type0:$dst);
-  let InOperandList = (ins type0:$src);
-  let hasSideEffects = 0;
-}
-
-def G_FCMLEZ : AArch64GenericInstruction {
-  let OutOperandList = (outs type0:$dst);
-  let InOperandList = (ins type0:$src);
-  let hasSideEffects = 0;
-}
-
-def G_FCMLTZ : AArch64GenericInstruction {
-  let OutOperandList = (outs type0:$dst);
-  let InOperandList = (ins type0:$src);
-  let hasSideEffects = 0;
-}
-
 def G_AARCH64_PREFETCH : AArch64GenericInstruction {
   let OutOperandList = (outs);
   let InOperandList = (ins type0:$imm, ptype0:$src1);
@@ -295,12 +265,6 @@ def : GINodeEquiv<G_FCMEQ, AArch64fcmeq>;
 def : GINodeEquiv<G_FCMGE, AArch64fcmge>;
 def : GINodeEquiv<G_FCMGT, AArch64fcmgt>;
 
-def : GINodeEquiv<G_FCMEQZ, AArch64fcmeqz>;
-def : GINodeEquiv<G_FCMGEZ, AArch64fcmgez>;
-def : GINodeEquiv<G_FCMGTZ, AArch64fcmgtz>;
-def : GINodeEquiv<G_FCMLEZ, AArch64fcmlez>;
-def : GINodeEquiv<G_FCMLTZ, AArch64fcmltz>;
-
 def : GINodeEquiv<G_BSP, AArch64bsp>;
 
 def : GINodeEquiv<G_UMULL, AArch64umull>;
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index b90792d60d102..319d5b0519487 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -876,11 +876,20 @@ def AArch64cmltz : PatFrag<(ops node:$lhs),
 def AArch64cmtst : PatFrag<(ops node:$LHS, node:$RHS),
                            (vnot (AArch64cmeqz (and node:$LHS, node:$RHS)))>;
 
-def AArch64fcmeqz: SDNode<"AArch64ISD::FCMEQz", SDT_AArch64fcmpz>;
-def AArch64fcmgez: SDNode<"AArch64ISD::FCMGEz", SDT_AArch64fcmpz>;
-def AArch64fcmgtz: SDNode<"AArch64ISD::FCMGTz", SDT_AArch64fcmpz>;
-def AArch64fcmlez: SDNode<"AArch64ISD::FCMLEz", SDT_AArch64fcmpz>;
-def AArch64fcmltz: SDNode<"AArch64ISD::FCMLTz", SDT_AArch64fcmpz>;
+def AArch64fcmeqz : PatFrag<(ops node:$lhs),
+                            (AArch64fcmeq node:$lhs, immAllZerosV)>;
+
+def AArch64fcmgez : PatFrag<(ops node:$lhs),
+                            (AArch64fcmge node:$lhs, immAllZerosV)>;
+
+def AArch64fcmgtz : PatFrag<(ops node:$lhs),
+                            (AArch64fcmgt node:$lhs, immAllZerosV)>;
+
+def AArch64fcmlez : PatFrag<(ops node:$lhs),
+                            (AArch64fcmge immAllZerosV, node:$lhs)>;
+
+def AArch64fcmltz : PatFrag<(ops node:$lhs),
+                            (AArch64fcmgt immAllZerosV, node:$lhs)>;
 
 def AArch64fcvtxn_n: SDNode<"AArch64ISD::FCVTXN", SDTFPRoundOp>;
 def AArch64fcvtxnsdr: PatFrags<(ops node:$Rn),
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp b/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp
index 4785c7b68d94d..3e242c1b9a114 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp
@@ -808,16 +808,14 @@ void applyScalarizeVectorUnmerge(MachineInstr &MI, MachineRegisterInfo &MRI,
 
 bool matchBuildVectorToDup(MachineInstr &MI, MachineRegisterInfo &MRI) {
   assert(MI.getOpcode() == TargetOpcode::G_BUILD_VECTOR);
-  auto Splat = getAArch64VectorSplat(MI, MRI);
-  if (!Splat)
-    return false;
-  if (Splat->isReg())
-    return true;
+
   // Later, during selection, we'll try to match imported patterns using
   // immAllOnesV and immAllZerosV. These require G_BUILD_VECTOR. Don't lower
   // G_BUILD_VECTORs which could match those patterns.
-  int64_t Cst = Splat->getCst();
-  return (Cst != 0 && Cst != -1);
+  if (isBuildVectorAllZeros(MI, MRI) || isBuildVectorAllOnes(MI, MRI))
+    return false;
+
+  return getAArch64VectorSplat(MI, MRI).has_value();
 }
 
 void applyBuildVectorToDup(MachineInstr &MI, MachineRegisterInfo &MRI,
@@ -936,8 +934,8 @@ void applySwapICmpOperands(MachineInstr &MI, GISelChangeObserver &Observer) {
 /// \param [in] IsZero - True if the comparison is against 0.
 /// \param [in] NoNans - True if the target has NoNansFPMath.
 std::function<Register(MachineIRBuilder &)>
-getVectorFCMP(AArch64CC::CondCode CC, Register LHS, Register RHS, bool IsZero,
-              bool NoNans, MachineRegisterInfo &MRI) {
+getVectorFCMP(AArch64CC::CondCode CC, Register LHS, Register RHS, bool NoNans,
+              MachineRegisterInfo &MRI) {
   LLT DstTy = MRI.getType(LHS);
   assert(DstTy.isVector() && "Expected vector types only?");
   assert(DstTy == MRI.getType(RHS) && "Src and Dst types must match!");
@@ -945,46 +943,29 @@ getVectorFCMP(AArch64CC::CondCode CC, Register LHS, Register RHS, bool IsZero,
   default:
     llvm_unreachable("Unexpected condition code!");
   case AArch64CC::NE:
-    return [LHS, RHS, IsZero, DstTy](MachineIRBuilder &MIB) {
-      auto FCmp = IsZero
-                      ? MIB.buildInstr(AArch64::G_FCMEQZ, {DstTy}, {LHS})
-                      : MIB.buildInstr(AArch64::G_FCMEQ, {DstTy}, {LHS, RHS});
+    return [LHS, RHS, DstTy](MachineIRBuilder &MIB) {
+      auto FCmp = MIB.buildInstr(AArch64::G_FCMEQ, {DstTy}, {LHS, RHS});
       return MIB.buildNot(DstTy, FCmp).getReg(0);
     };
   case AArch64CC::EQ:
-    return [LHS, RHS, IsZero, DstTy](MachineIRBuilder &MIB) {
-      return IsZero
-                 ? MIB.buildInstr(AArch64::G_FCMEQZ, {DstTy}, {LHS}).getReg(0)
-                 : MIB.buildInstr(AArch64::G_FCMEQ, {DstTy}, {LHS, RHS})
-                       .getReg(0);
+    return [LHS, RHS, DstTy](MachineIRBuilder &MIB) {
+      return MIB.buildInstr(AArch64::G_FCMEQ, {DstTy}, {LHS, RHS}).getReg(0);
     };
   case AArch64CC::GE:
-    return [LHS, RHS, IsZero, DstTy](MachineIRBuilder &MIB) {
-      return IsZero
-                 ? MIB.buildInstr(AArch64::G_FCMGEZ, {DstTy}, {LHS}).getReg(0)
-                 : MIB.buildInstr(AArch64::G_FCMGE, {DstTy}, {LHS, RHS})
-                       .getReg(0);
+    return [LHS, RHS, DstTy](MachineIRBuilder &MIB) {
+      return MIB.buildInstr(AArch64::G_FCMGE, {DstTy}, {LHS, RHS}).getReg(0);
     };
   case AArch64CC::GT:
-    return [LHS, RHS, IsZero, DstTy](MachineIRBuilder &MIB) {
-      return IsZero
-                 ? MIB.buildInstr(AArch64::G_FCMGTZ, {DstTy}, {LHS}).getReg(0)
-                 : MIB.buildInstr(AArch64::G_FCMGT, {DstTy}, {LHS, RHS})
-                       .getReg(0);
+    return [LHS, RHS, DstTy](MachineIRBuilder &MIB) {
+      return MIB.buildInstr(AArch64::G_FCMGT, {DstTy}, {LHS, RHS}).getReg(0);
     };
   case AArch64CC::LS:
-    return [LHS, RHS, IsZero, DstTy](MachineIRBuilder &MIB) {
-      return IsZero
-                 ? MIB.buildInstr(AArch64::G_FCMLEZ, {DstTy}, {LHS}).getReg(0)
-                 : MIB.buildInstr(AArch64::G_FCMGE, {DstTy}, {RHS, LHS})
-                       .getReg(0);
+    return [LHS, RHS, DstTy](MachineIRBuilder &MIB) {
+      return MIB.buildInstr(AArch64::G_FCMGE, {DstTy}, {RHS, LHS}).getReg(0);
     };
   case AArch64CC::MI:
-    return [LHS, RHS, IsZero, DstTy](MachineIRBuilder &MIB) {
-      return IsZero
-                 ? MIB.buildInstr(AArch64::G_FCMLTZ, {DstTy}, {LHS}).getReg(0)
-                 : MIB.buildInstr(AArch64::G_FCMGT, {DstTy}, {RHS, LHS})
-                       .getReg(0);
+    return [LHS, RHS, DstTy](MachineIRBuilder &MIB) {
+      return MIB.buildInstr(AArch64::G_FCMGT, {DstTy}, {RHS, LHS}).getReg(0);
     };
   }
 }
@@ -1024,23 +1005,17 @@ void applyLowerVectorFCMP(MachineInstr &MI, MachineRegisterInfo &MRI,
 
   LLT DstTy = MRI.getType(Dst);
 
-  auto Splat = getAArch64VectorSplat(*MRI.getVRegDef(RHS), MRI);
-
-  // Compares against 0 have special target-specific pseudos.
-  bool IsZero = Splat && Splat->isCst() && Splat->getCst() == 0;
-
   bool Invert = false;
   AArch64CC::CondCode CC, CC2 = AArch64CC::AL;
   if ((Pred == CmpInst::Predicate::FCMP_ORD ||
        Pred == CmpInst::Predicate::FCMP_UNO) &&
-      IsZero) {
+      isBuildVectorAllZeros(*MRI.getVRegDef(RHS), MRI)) {
     // The special case "fcmp ord %a, 0" is the canonical check that LHS isn't
     // NaN, so equivalent to a == a and doesn't need the two comparisons an
     // "ord" normally would.
     // Similarly, "fcmp uno %a, 0" is the canonical check that LHS is NaN and is
     // thus equivalent to a != a.
     RHS = LHS;
-    IsZero = false;
     CC = Pred == CmpInst::Predicate::FCMP_ORD ? AArch64CC::EQ : AArch64CC::NE;
   } else
     changeVectorFCMPPredToAArch64CC(Pred, CC, CC2, Invert);
@@ -1051,12 +1026,12 @@ void applyLowerVectorFCMP(MachineInstr &MI, MachineRegisterInfo &MRI,
   const bool NoNans =
       ST.getTargetLowering()->getTargetMachine().Options.NoNaNsFPMath;
 
-  auto Cmp = getVectorFCMP(CC, LHS, RHS, IsZero, NoNans, MRI);
+  auto Cmp = getVectorFCMP(CC, LHS, RHS, NoNans, MRI);
   Register CmpRes;
   if (CC2 == AArch64CC::AL)
     CmpRes = Cmp(MIB);
   else {
-    auto Cmp2 = getVectorFCMP(CC2, LHS, RHS, IsZero, NoNans, MRI);
+    auto Cmp2 = getVectorFCMP(CC2, LHS, RHS, NoNans, MRI);
     auto Cmp2Dst = Cmp2(MIB);
     auto Cmp1Dst = Cmp(MIB);
     CmpRes = MIB.buildOr(DstTy, Cmp1Dst, Cmp2Dst).getReg(0);
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/lower-neon-vector-fcmp.mir b/llvm/test/CodeGen/AArch64/GlobalISel/lower-neon-vector-fcmp.mir
index 1f5fb892df582..afc4e9eac7547 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/lower-neon-vector-fcmp.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/lower-neon-vector-fcmp.mir
@@ -31,14 +31,15 @@ body:             |
   bb.0:
     liveins: $q0, $q1
 
-    ; Should be inverted. Needs two compares.
 
     ; CHECK-LABEL: name: oeq_zero
     ; CHECK: liveins: $q0, $q1
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: %lhs:_(<2 x s64>) = COPY $q0
-    ; CHECK-NEXT: [[FCMEQZ:%[0-9]+]]:_(<2 x s64>) = G_FCMEQZ %lhs
-    ; CHECK-NEXT: $q0 = COPY [[FCMEQZ]](<2 x s64>)
+    ; CHECK-NEXT: %zero:_(s64) = G_CONSTANT i64 0
+    ; CHECK-NEXT: %zero_vec:_(<2 x s64>) = G_BUILD_VECTOR %zero(s64), %zero(s64)
+    ; CHECK-NEXT: [[FCMEQ:%[0-9]+]]:_(<2 x s64>) = G_FCMEQ %lhs, %zero_vec(<2 x s64>)
+    ; CHECK-NEXT: $q0 = COPY [[FCMEQ]](<2 x s64>)
     ; CHECK-NEXT: RET_ReallyLR implicit $q0
     %lhs:_(<2 x s64>) = COPY $q0
     %zero:_(s64) = G_CONSTANT i64 0
@@ -82,8 +83,10 @@ body:             |
     ; CHECK: liveins: $q0, $q1
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: %lhs:_(<2 x s64>) = COPY $q0
-    ; CHECK-NEXT: [[FCMGTZ:%[0-9]+]]:_(<2 x s64>) = G_FCMGTZ %lhs
-    ; CHECK-NEXT: $q0 = COPY [[FCMGTZ]](<2 x s64>)
+    ; CHECK-NEXT: %zero:_(s64) = G_CONSTANT i64 0
+    ; CHECK-NEXT: %zero_vec:_(<2 x s64>) = G_BUILD_VECTOR %zero(s64), %zero(s64)
+    ; CHECK-NEXT: [[FCMGT:%[0-9]+]]:_(<2 x s64>) = G_FCMGT %lhs, %zero_vec(<2 x s64>)
+    ; CHECK-NEXT: $q0 = COPY [[FCMGT]](<2 x s64>)
     ; CHECK-NEXT: RET_ReallyLR implicit $q0
     %lhs:_(<2 x s64>) = COPY $q0
     %zero:_(s64) = G_CONSTANT i64 0
@@ -123,14 +126,15 @@ body:             |
   bb.0:
     liveins: $q0, $q1
 
-    ; Should be inverted. Needs two compares.
 
     ; CHECK-LABEL: name: oge_zero
     ; CHECK: liveins: $q0, $q1
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: %lhs:_(<2 x s64>) = COPY $q0
-    ; CHECK-NEXT: [[FCMGEZ:%[0-9]+]]:_(<2 x s64>) = G_FCMGEZ %lhs
-    ; CHECK-NEXT: $q0 = COPY [[FCMGEZ]](<2 x s64>)
+    ; CHECK-NEXT: %zero:_(s64) = G_CONSTANT i64 0
+    ; CHECK-NEXT: %zero_vec:_(<2 x s64>) = G_BUILD_VECTOR %zero(s64), %zero(s64)
+    ; CHECK-NEXT: [[FCMGE:%[0-9]+]]:_(<2 x s64>) = G_FCMGE %lhs, %zero_vec(<2 x s64>)
+    ; CHECK-NEXT: $q0 = COPY [[FCMGE]](<2 x s64>)
     ; CHECK-NEXT: RET_ReallyLR implicit $q0
     %lhs:_(<2 x s64>) = COPY $q0
     %zero:_(s64) = G_CONSTANT i64 0
@@ -174,8 +178,10 @@ body:             |
     ; CHECK: liveins: $q0, $q1
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: %lhs:_(<2 x s64>) = COPY $q0
-    ; CHECK-NEXT: [[FCMLTZ:%[0-9]+]]:_(<2 x s64>) = G_FCMLTZ %lhs
-    ; CHECK-NEXT: $q0 = COPY [[FCMLTZ]](<2 x s64>)
+    ; CHECK-NEXT: %zero:_(s64) = G_CONSTANT i64 0
+    ; CHECK-NEXT: %zero_vec:_(<2 x s64>) = G_BUILD_VECTOR %zero(s64), %zero(s64)
+    ; CHECK-NEXT: [[FCMGT:%[0-9]+]]:_(<2 x s64>) = G_FCMGT %zero_vec, %lhs(<2 x s64>)
+    ; CHECK-NEXT: $q0 = COPY [[FCMGT]](<2 x s64>)
     ; CHECK-NEXT: RET_ReallyLR implicit $q0
     %lhs:_(<2 x s64>) = COPY $q0
     %zero:_(s64) = G_CONSTANT i64 0
@@ -218,8 +224,10 @@ body:             |
     ; CHECK: liveins: $q0, $q1
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: %lhs:_(<2 x s64>) = COPY $q0
-    ; CHECK-NEXT: [[FCMLEZ:%[0-9]+]]:_(<2 x s64>) = G_FCMLEZ %lhs
-    ; CHECK-NEXT: $q0 = COPY [[FCMLEZ]](<2 x s64>)
+    ; CHECK-NEXT: %zero:_(s64) = G_CONSTANT i64 0
+    ; CHECK-NEXT: %zero_vec:_(<2 x s64>) = G_BUILD_VECTOR %zero(s64), %zero(s64)
+    ; CHECK-NEXT: [[FCMGE:%[0-9]+]]:_(<2 x s64>) = G_FCMGE %zero_vec, %lhs(<2 x s64>)
+    ; CHECK-NEXT: $q0 = COPY [[FCMGE]](<2 x s64>)
     ; CHECK-NEXT: RET_ReallyLR implicit $q0
     %lhs:_(<2 x s64>) = COPY $q0
     %zero:_(s64) = G_CONSTANT i64 0
@@ -237,7 +245,6 @@ body:             |
   bb.0:
     liveins: $q0, $q1
 
-    ; Two compares.
 
     ; CHECK-LABEL: name: one
     ; CHECK: liveins: $q0, $q1
@@ -264,15 +271,16 @@ body:             |
   bb.0:
     liveins: $q0, $q1
 
-    ; Two compares.
 
     ; CHECK-LABEL: name: one_zero
     ; CHECK: liveins: $q0, $q1
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: %lhs:_(<2 x s64>) = COPY $q0
-    ; CHECK-NEXT: [[FCMGTZ:%[0-9]+]]:_(<2 x s64>) = G_FCMGTZ %lhs
-    ; CHECK-NEXT: [[FCMLTZ:%[0-9]+]]:_(<2 x s64>) = G_FCMLTZ %lhs
-    ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s64>) = G_OR [[FCMLTZ]], [[FCMGTZ]]
+    ; CHECK-NEXT: %zero:_(s64) = G_CONSTANT i64 0
+    ; CHECK-NEXT: %zero_vec:_(<2 x s64>) = G_BUILD_VECTOR %zero(s64), %zero(s64)
+    ; CHECK-NEXT: [[FCMGT:%[0-9]+]]:_(<2 x s64>) = G_FCMGT %lhs, %zero_vec(<2 x s64>)
+    ; CHECK-NEXT: [[FCMGT1:%[0-9]+]]:_(<2 x s64>) = G_FCMGT %zero_vec, %lhs(<2 x s64>)
+    ; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s64>) = G_OR [[FCMGT1]], [[FCMGT]]
     ; CHECK-NEXT: $q0 = COPY [[OR]](<2 x s64>)
     ; CHECK-NEXT: RET_ReallyLR implicit $q0
     %lhs:_(<2 x s64>) = COPY $q0
@@ -291,7 +299,6 @@ body:             |
   bb.0:
     liveins: $q0, $q1
 
-    ; Should be inverted. Needs two compares.
 
     ; CHECK-LABEL: name: uno
     ; CHECK: liveins: $q0, $q1
@@ -348,7 +355,6 @@ body:             |
   bb.0:
     liveins: $q0, $q1
 
-    ; Needs two compares. No invert.
 
     ; CHECK-LABEL: name: ord
     ; CHECK: liveins: $q0, $q1
@@ -375,7 +381,6 @@ body:             |
   bb.0:
     liveins: $q0, $q1
 
-    ; Needs two compares. No invert.
 
     ; CHECK-LABEL: name: ord_zero
     ; CHECK: liv...
[truncated]

llvm/test/CodeGen/AArch64/GlobalISel/lower-neon-vector-fcmp.mir

We can easily select compare-to-zero instructions without dedicated nodes. The test changes show opportunities that were previous missed because of the redundant complexity. The global-isel changes are due to isBuildVectorAllZeros not identifying all zero floating point vectors. Despite the use of getAnyConstantSplat, which does work, its result is wrapped inside m_SpecificICst that pushes us down an integer only path. I am new to global-isel so is it safe to assume the result of getAnyConstantSplat can be tested directly?

aemerson

Seems like a nice clean up. For at least the GISel part LGTM.

aemerson · 2025-04-20T06:58:38Z

llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp

@@ -936,55 +934,38 @@ void applySwapICmpOperands(MachineInstr &MI, GISelChangeObserver &Observer) {
 /// \param [in] IsZero - True if the comparison is against 0.


Obsolete comment line

gulfemsavrun · 2025-04-23T18:01:34Z

We started seeing an assertion failure, and I bisected it to this commit:

60507/168895](64) CC obj/third_party/github.com/google/liblc3/src/src/lc3_codec.mdct.c.o
FAILED: [code=1] obj/third_party/github.com/google/liblc3/src/src/lc3_codec.mdct.c.o 
../../prebuilt/third_party/clang/custom/bin/clang -MD -MF obj/third_party/github.com/google/liblc3/src/src/lc3_codec.mdct.c.o.d -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -D_LIBCPP_REMOVE_TRANSITIVE_INCLUDES -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS=1 -DZX_ASSERT_LEVEL=2 -I../.. -Igen -I../../third_party/github.com/google/liblc3/src/include -fcolor-diagnostics -fcrash-diagnostics-dir=clang-crashreports -fcrash-diagnostics=all -gen-reproducer=error -ffp-contract=off --sysroot=gen/zircon/public/sysroot --target=aarch64-unknown-fuchsia -ffuchsia-api-level=4293918720 -march=armv8-a+simd+crc+crypto -mtune=generic -ffile-compilation-dir=. -no-canonical-prefixes -fno-omit-frame-pointer -momit-leaf-frame-pointer -fdata-sections -ffunction-sections -O0 -Xclang -debug-info-kind=constructor -g3 -grecord-gcc-switches -gdwarf-5 -gz=zstd -Wall -Wextra -Wconversion -Wextra-semi -Wimplicit-fallthrough -Wnewline-eof -Wstrict-prototypes -Wwrite-strings -Wno-sign-conversion -Wno-unused-parameter -Wnonportable-system-include-path -Wno-missing-field-initializers -Wno-extra-qualification -Wno-cast-function-type-strict -Wno-cast-function-type-mismatch -Wno-unknown-warning-option -Wno-missing-template-arg-list-after-template-kw -Wno-deprecated-pragma -Wno-nontrivial-memaccess -fvisibility=hidden -Werror -Wa,--fatal-warnings -ftrivial-auto-var-init=pattern -Wthread-safety -Wno-unknown-warning-option -Wno-thread-safety-reference-return -Wno-conversion -std=c11 -c ../../third_party/github.com/google/liblc3/src/src/mdct.c -o obj/third_party/github.com/google/liblc3/src/src/lc3_codec.mdct.c.o
clang: llvm/lib/IR/Constants.cpp:1783: const APInt &llvm::Constant::getUniqueInteger() const: Assertion `C && isa<ConstantInt>(C) && "Not a vector of numbers!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: ../../prebuilt/third_party/clang/custom/bin/clang -MD -MF obj/third_party/github.com/google/liblc3/src/src/lc3_codec.mdct.c.o.d -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -D_LIBCPP_REMOVE_TRANSITIVE_INCLUDES -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS=1 -DZX_ASSERT_LEVEL=2 -I../.. -Igen -I../../third_party/github.com/google/liblc3/src/include -fcolor-diagnostics -fcrash-diagnostics-dir=clang-crashreports -fcrash-diagnostics=all -gen-reproducer=error -ffp-contract=off --sysroot=gen/zircon/public/sysroot --target=aarch64-unknown-fuchsia -ffuchsia-api-level=4293918720 -march=armv8-a+simd+crc+crypto -mtune=generic -ffile-compilation-dir=. -no-canonical-prefixes -fno-omit-frame-pointer -momit-leaf-frame-pointer -fdata-sections -ffunction-sections -O0 -Xclang -debug-info-kind=constructor -g3 -grecord-gcc-switches -gdwarf-5 -gz=zstd -Wall -Wextra -Wconversion -Wextra-semi -Wimplicit-fallthrough -Wnewline-eof -Wstrict-prototypes -Wwrite-strings -Wno-sign-conversion -Wno-unused-parameter -Wnonportable-system-include-path -Wno-missing-field-initializers -Wno-extra-qualification -Wno-cast-function-type-strict -Wno-cast-function-type-mismatch -Wno-unknown-warning-option -Wno-missing-template-arg-list-after-template-kw -Wno-deprecated-pragma -Wno-nontrivial-memaccess -fvisibility=hidden -Werror -Wa,--fatal-warnings -ftrivial-auto-var-init=pattern -Wthread-safety -Wno-unknown-warning-option -Wno-thread-safety-reference-return -Wno-conversion -std=c11 -c ../../third_party/github.com/google/liblc3/src/src/mdct.c -o obj/third_party/github.com/google/liblc3/src/src/lc3_codec.mdct.c.o
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '../../third_party/github.com/google/liblc3/src/src/mdct.c'.
4.	Running pass 'InstructionSelect' on function '@neon_fft_5'
#0 0x0000561163975c78 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (../../prebuilt/third_party/clang/custom/bin/clang+0x91bcc78)
clang: error: clang frontend command failed with exit code 134 (use -v to see invocation)
Fuchsia clang version 21.0.0git (https://llvm.googlesource.com/llvm-project 2a9f77f6bd48d757b2d45aadcb6cf76ef4b4ef32)
Target: aarch64-unknown-fuchsia
Thread model: posix
InstalledDir: ../../prebuilt/third_party/clang/custom/bin
Build config: +assertions
clang: note: diagnostic msg: 
********************

https://luci-milo.appspot.com/ui/p/fuchsia/builders/global.ci/clang_toolchain.ci.core.arm64-debug/b8716800273666175553/overview

gulfemsavrun · 2025-04-23T18:09:00Z

I'm attaching a reproducer:
builds_8716800273666175553_build-debug_clang-crashreports_mdct-116f6d.zip

This reverts commit 15d8b3c.

paulwalker-arm · 2025-04-24T09:55:09Z

@gulfemsavrun: Sorry for the failure and thanks for the reproducer. I've confirmed my PR is responsible and have landed a revert whilst I investigate further.

…)" This reverts commit 427b644. Original patch has been updated to include a fix to esnure AArch64InstructionSelector::emitConstantVector supports all the cases where isBuildVectorAllOnes returns true.

paulwalker-arm · 2025-04-24T13:01:51Z

I've landed a fixed version, turns out emitConstantVector did not support floating-point vector constants.

gulfemsavrun · 2025-04-24T15:50:46Z

I've landed a fixed version, turns out emitConstantVector did not support floating-point vector constants.

Thanks for the fix, and I verified that your fix resolved the issue.

We can easily select compare-to-zero instructions without dedicated nodes. The test changes show opportunities that were previous missed because of the redundant complexity.

…35817)" This reverts commit 15d8b3c.

…135817)" This reverts commit 427b644. Original patch has been updated to include a fix to esnure AArch64InstructionSelector::emitConstantVector supports all the cases where isBuildVectorAllOnes returns true.

We can easily select compare-to-zero instructions without dedicated nodes. The test changes show opportunities that were previous missed because of the redundant complexity.

…35817)" This reverts commit 15d8b3c.

…135817)" This reverts commit 427b644. Original patch has been updated to include a fix to esnure AArch64InstructionSelector::emitConstantVector supports all the cases where isBuildVectorAllOnes returns true.

We can easily select compare-to-zero instructions without dedicated nodes. The test changes show opportunities that were previous missed because of the redundant complexity.

…35817)" This reverts commit 15d8b3c.

…135817)" This reverts commit 427b644. Original patch has been updated to include a fix to esnure AArch64InstructionSelector::emitConstantVector supports all the cases where isBuildVectorAllOnes returns true.

…35817)" This reverts commit 15d8b3c.

…135817)" This reverts commit 427b644. Original patch has been updated to include a fix to esnure AArch64InstructionSelector::emitConstantVector supports all the cases where isBuildVectorAllOnes returns true.

llvmbot added backend:AArch64 llvm:globalisel labels Apr 15, 2025

paulwalker-arm requested review from aemerson and davemgreen April 15, 2025 17:22

arsenm reviewed Apr 17, 2025

View reviewed changes

llvm/test/CodeGen/AArch64/GlobalISel/lower-neon-vector-fcmp.mir Outdated Show resolved Hide resolved

paulwalker-arm force-pushed the neon-setcc-remove-fcmp-zero-nodes branch from b296e09 to 447b338 Compare April 18, 2025 13:58

aemerson approved these changes Apr 20, 2025

View reviewed changes

Remove redundant IsZero comment.

89b1c80

paulwalker-arm merged commit 15d8b3c into llvm:main Apr 23, 2025
11 checks passed

paulwalker-arm deleted the neon-setcc-remove-fcmp-zero-nodes branch April 23, 2025 10:29

paulwalker-arm added a commit that referenced this pull request Apr 24, 2025

Revert "[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. (#135817)"

427b644

This reverts commit 15d8b3c.

IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025

Revert "[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. (llvm#1…

1f9b5af

…35817)" This reverts commit 15d8b3c.

IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025

Revert "[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. (llvm#1…

9921028

…35817)" This reverts commit 15d8b3c.

IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025

Revert "[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. (llvm#1…

4392adc

…35817)" This reverts commit 15d8b3c.

Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025

Revert "[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. (llvm#1…

ecdff90

…35817)" This reverts commit 15d8b3c.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. #135817

[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. #135817

Uh oh!

paulwalker-arm commented Apr 15, 2025

Uh oh!

llvmbot commented Apr 15, 2025 •

edited

Loading

Uh oh!

Uh oh!

aemerson left a comment

Uh oh!

aemerson Apr 20, 2025

Uh oh!

Uh oh!

gulfemsavrun commented Apr 23, 2025

Uh oh!

gulfemsavrun commented Apr 23, 2025

Uh oh!

paulwalker-arm commented Apr 24, 2025

Uh oh!

paulwalker-arm commented Apr 24, 2025

Uh oh!

gulfemsavrun commented Apr 24, 2025

Uh oh!

Uh oh!

		@@ -936,55 +934,38 @@ void applySwapICmpOperands(MachineInstr &MI, GISelChangeObserver &Observer) {
		/// \param [in] IsZero - True if the comparison is against 0.

[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. #135817

[LLVM][ISel][AArch64 Remove AArch64ISD::FCM##z nodes. #135817

Uh oh!

Conversation

paulwalker-arm commented Apr 15, 2025

Uh oh!

llvmbot commented Apr 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

aemerson left a comment

Choose a reason for hiding this comment

Uh oh!

aemerson Apr 20, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gulfemsavrun commented Apr 23, 2025

Uh oh!

gulfemsavrun commented Apr 23, 2025

Uh oh!

paulwalker-arm commented Apr 24, 2025

Uh oh!

paulwalker-arm commented Apr 24, 2025

Uh oh!

gulfemsavrun commented Apr 24, 2025

Uh oh!

Uh oh!

llvmbot commented Apr 15, 2025 •

edited

Loading