[InstCombine] Add folds for `(fp_binop ({s|u}itofp x), ({s|u}itofp y))` #82555

goldsteinn · 2024-02-22T00:16:23Z

The full fold is one of the following:

(fp_binop ({s|u}itofp x), ({s|u}itofp y))
-> ({s|u}itofp (int_binop x, y))
(fp_binop ({s|u}itofp x), FpC)
-> ({s|u}itofp (int_binop x, (fpto{s|u}i FpC)))

And support the following binops:
fmul -> mul
fadd -> add
fsub -> sub

Proofs: https://alive2.llvm.org/ce/z/zuacA8

The proofs timeout, so they must be reproduced locally.

llvmbot · 2024-02-22T00:16:57Z

@llvm/pr-subscribers-llvm-transforms

Author: None (goldsteinn)

Changes

[InstCombine] Add tests for folding `(fp_binop ({s|u}itofp x)
[InstCombine] Move folding `(add (sitofp x)
[InstCombine] Add folds for `(fp_binop ({s|u}itofp x)

Patch is 49.29 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/82555.diff

7 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp (+6-57)
(modified) llvm/lib/Transforms/InstCombine/InstCombineInternal.h (+1)
(modified) llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp (+3)
(modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+165)
(modified) llvm/test/Transforms/InstCombine/add-sitofp.ll (+10-10)
(added) llvm/test/Transforms/InstCombine/binop-itofp.ll (+998)
(modified) llvm/test/Transforms/InstCombine/pr33453.ll (+1-3)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
index cfff5df9ff5005..f0ca83bb225f1f 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
@@ -1867,64 +1867,10 @@ Instruction *InstCombinerImpl::visitFAdd(BinaryOperator &I) {
 
   // Check for (fadd double (sitofp x), y), see if we can merge this into an
   // integer add followed by a promotion.
-  Value *LHS = I.getOperand(0), *RHS = I.getOperand(1);
-  if (SIToFPInst *LHSConv = dyn_cast<SIToFPInst>(LHS)) {
-    Value *LHSIntVal = LHSConv->getOperand(0);
-    Type *FPType = LHSConv->getType();
-
-    // TODO: This check is overly conservative. In many cases known bits
-    // analysis can tell us that the result of the addition has less significant
-    // bits than the integer type can hold.
-    auto IsValidPromotion = [](Type *FTy, Type *ITy) {
-      Type *FScalarTy = FTy->getScalarType();
-      Type *IScalarTy = ITy->getScalarType();
-
-      // Do we have enough bits in the significand to represent the result of
-      // the integer addition?
-      unsigned MaxRepresentableBits =
-          APFloat::semanticsPrecision(FScalarTy->getFltSemantics());
-      return IScalarTy->getIntegerBitWidth() <= MaxRepresentableBits;
-    };
-
-    // (fadd double (sitofp x), fpcst) --> (sitofp (add int x, intcst))
-    // ... if the constant fits in the integer value.  This is useful for things
-    // like (double)(x & 1234) + 4.0 -> (double)((X & 1234)+4) which no longer
-    // requires a constant pool load, and generally allows the add to be better
-    // instcombined.
-    if (ConstantFP *CFP = dyn_cast<ConstantFP>(RHS))
-      if (IsValidPromotion(FPType, LHSIntVal->getType())) {
-        Constant *CI = ConstantFoldCastOperand(Instruction::FPToSI, CFP,
-                                               LHSIntVal->getType(), DL);
-        if (LHSConv->hasOneUse() &&
-            ConstantFoldCastOperand(Instruction::SIToFP, CI, I.getType(), DL) ==
-                CFP &&
-            willNotOverflowSignedAdd(LHSIntVal, CI, I)) {
-          // Insert the new integer add.
-          Value *NewAdd = Builder.CreateNSWAdd(LHSIntVal, CI, "addconv");
-          return new SIToFPInst(NewAdd, I.getType());
-        }
-      }
-
-    // (fadd double (sitofp x), (sitofp y)) --> (sitofp (add int x, y))
-    if (SIToFPInst *RHSConv = dyn_cast<SIToFPInst>(RHS)) {
-      Value *RHSIntVal = RHSConv->getOperand(0);
-      // It's enough to check LHS types only because we require int types to
-      // be the same for this transform.
-      if (IsValidPromotion(FPType, LHSIntVal->getType())) {
-        // Only do this if x/y have the same type, if at least one of them has a
-        // single use (so we don't increase the number of int->fp conversions),
-        // and if the integer add will not overflow.
-        if (LHSIntVal->getType() == RHSIntVal->getType() &&
-            (LHSConv->hasOneUse() || RHSConv->hasOneUse()) &&
-            willNotOverflowSignedAdd(LHSIntVal, RHSIntVal, I)) {
-          // Insert the new integer add.
-          Value *NewAdd = Builder.CreateNSWAdd(LHSIntVal, RHSIntVal, "addconv");
-          return new SIToFPInst(NewAdd, I.getType());
-        }
-      }
-    }
-  }
+  if (Instruction *R = foldFBinOpOfIntCasts(I))
+    return R;
 
+  Value *LHS = I.getOperand(0), *RHS = I.getOperand(1);
   // Handle specials cases for FAdd with selects feeding the operation
   if (Value *V = SimplifySelectsFeedingBinaryOp(I, LHS, RHS))
     return replaceInstUsesWith(I, V);
@@ -2832,6 +2778,9 @@ Instruction *InstCombinerImpl::visitFSub(BinaryOperator &I) {
   if (Instruction *X = foldFNegIntoConstant(I, DL))
     return X;
 
+  if (Instruction *R = foldFBinOpOfIntCasts(I))
+    return R;
+
   Value *X, *Y;
   Constant *C;
 
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
index 0b4283bc37650a..57148d719d9b61 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
+++ b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
@@ -379,6 +379,7 @@ class LLVM_LIBRARY_VISIBILITY InstCombinerImpl final
   Instruction *scalarizePHI(ExtractElementInst &EI, PHINode *PN);
   Instruction *foldBitcastExtElt(ExtractElementInst &ExtElt);
   Instruction *foldCastedBitwiseLogic(BinaryOperator &I);
+  Instruction *foldFBinOpOfIntCasts(BinaryOperator &I);
   Instruction *foldBinopOfSextBoolToSelect(BinaryOperator &I);
   Instruction *narrowBinOp(TruncInst &Trunc);
   Instruction *narrowMaskedBinOp(BinaryOperator &And);
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
index 0bd4b6d1a835af..3ebf6b3d9bf7f7 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
@@ -769,6 +769,9 @@ Instruction *InstCombinerImpl::visitFMul(BinaryOperator &I) {
   if (Instruction *R = foldFPSignBitOps(I))
     return R;
 
+  if (Instruction *R = foldFBinOpOfIntCasts(I))
+    return R;
+
   // X * -1.0 --> -X
   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);
   if (match(Op1, m_SpecificFP(-1.0)))
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 4af455c37c788c..e5bbf59151bf9a 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -1401,6 +1401,171 @@ Value *InstCombinerImpl::dyn_castNegVal(Value *V) const {
   return nullptr;
 }
 
+// Try to fold:
+//    1) (fp_binop ({s|u}itofp x), ({s|u}itofp y))
+//        -> ({s|u}itofp (int_binop x, y))
+//    2) (fp_binop ({s|u}itofp x), FpC)
+//        -> ({s|u}itofp (int_binop x, (fpto{s|u}i FpC)))
+Instruction *InstCombinerImpl::foldFBinOpOfIntCasts(BinaryOperator &BO) {
+  Value *IntOps[2];
+  Constant *Op1FpC = nullptr;
+
+  // Check for:
+  //    1) (binop ({s|u}itofp x), ({s|u}itofp y))
+  //    2) (binop ({s|u}itofp x), FpC)
+  if (!match(BO.getOperand(0), m_SIToFP(m_Value(IntOps[0]))) &&
+      !match(BO.getOperand(0), m_UIToFP(m_Value(IntOps[0]))))
+    return nullptr;
+
+  if (!match(BO.getOperand(1), m_Constant(Op1FpC)) &&
+      !match(BO.getOperand(1), m_SIToFP(m_Value(IntOps[1]))) &&
+      !match(BO.getOperand(1), m_UIToFP(m_Value(IntOps[1]))))
+    return nullptr;
+
+
+  Type *FPTy = BO.getType();
+  Type *IntTy = IntOps[0]->getType();
+
+  // Do we have signed casts?
+  bool OpsFromSigned = isa<SIToFPInst>(BO.getOperand(0));
+
+
+  unsigned IntSz = IntTy->getScalarSizeInBits();
+  // This is the maximum number of inuse bits by the integer where the int -> fp
+  // casts are exact.
+  unsigned MaxRepresentableBits =
+      APFloat::semanticsPrecision(FPTy->getScalarType()->getFltSemantics());
+
+  // Cache KnownBits a bit to potentially save some analysis.
+  std::optional<KnownBits> OpsKnown[2];
+
+  // Preserve known number of leading bits. This can allow us to trivial nsw/nuw
+  // checks later on.
+  unsigned NumUsedLeadingBits[2] = {IntSz, IntSz};
+
+  auto IsNonZero = [&](unsigned OpNo) -> bool {
+    if (OpsKnown[OpNo].has_value() && OpsKnown[OpNo]->isNonZero())
+      return true;
+    return isKnownNonZero(IntOps[OpNo], SQ.DL);
+  };
+
+  auto IsNonNeg = [&](unsigned OpNo) -> bool {
+    if (OpsKnown[OpNo].has_value() && OpsKnown[OpNo]->isNonNegative())
+      return true;
+    return isKnownNonNegative(IntOps[OpNo], SQ);
+  };
+
+  // Check if we know for certain that ({s|u}itofp op) is exact.
+  auto IsValidPromotion = [&](unsigned OpNo) -> bool {
+    // If fp precision >= bitwidth(op) then its exact.
+    if (MaxRepresentableBits >= IntSz)
+      ;
+    // Otherwise if its signed cast check that fp precisions >= bitwidth(op) -
+    // numSignBits(op).
+    else if (OpsFromSigned)
+      NumUsedLeadingBits[OpNo] = IntSz - ComputeNumSignBits(IntOps[OpNo]);
+    // Finally for unsigned check that fp precision >= bitwidth(op) -
+    // numLeadingZeros(op).
+    else {
+      if (!OpsKnown[OpNo].has_value())
+        OpsKnown[OpNo] = computeKnownBits(IntOps[OpNo], /*Depth*/ 0, &BO);
+      NumUsedLeadingBits[OpNo] = IntSz - OpsKnown[OpNo]->countMinLeadingZeros();
+    }
+    // NB: We could also check if op is known to be a power of 2 or zero (which
+    // will always be representable). Its unlikely, however, that is we are
+    // unable to bound op in any way we will be able to pass the overflow checks
+    // later on.
+
+    if (MaxRepresentableBits < NumUsedLeadingBits[OpNo])
+      return false;
+    // Signed + Mul also requires that op is non-zero to avoid -0 cases.
+    return (OpsFromSigned && BO.getOpcode() == Instruction::FMul)
+               ? IsNonZero(OpNo)
+               : true;
+
+  };
+
+  // If we have a constant rhs, see if we can losslessly convert it to an int.
+  if (Op1FpC != nullptr) {
+    Constant *Op1IntC = ConstantFoldCastOperand(
+        OpsFromSigned ? Instruction::FPToSI : Instruction::FPToUI, Op1FpC,
+        IntTy, DL);
+    if (Op1IntC == nullptr)
+      return nullptr;
+    if (ConstantFoldCastOperand(OpsFromSigned ? Instruction::SIToFP
+                                              : Instruction::UIToFP,
+                                Op1IntC, FPTy, DL) != Op1FpC)
+      return nullptr;
+
+    // First try to keep sign of cast the same.
+    IntOps[1] = Op1IntC;
+  }
+
+  // Ensure lhs/rhs integer types match.
+  if (IntTy != IntOps[1]->getType())
+    return nullptr;
+
+
+  if (Op1FpC == nullptr) {
+    if (OpsFromSigned != isa<SIToFPInst>(BO.getOperand(1))) {
+      // If we have a signed + unsigned, see if we can treat both as signed
+      // (uitofp nneg x) == (sitofp nneg x).
+      if (OpsFromSigned ? !IsNonNeg(1) : !IsNonNeg(0))
+        return nullptr;
+      OpsFromSigned = true;
+    }
+    if (!IsValidPromotion(1))
+      return nullptr;
+  }
+  if (!IsValidPromotion(0))
+    return nullptr;
+
+  // Final we check if the integer version of the binop will not overflow.
+  BinaryOperator::BinaryOps IntOpc;
+  // Because of the precision check, we can often rule out overflows.
+  bool NeedsOverflowCheck = true;
+  // Try to conservatively rule out overflow based on the already done precision
+  // checks.
+  unsigned OverflowMaxOutputBits = OpsFromSigned ? 2 : 1;
+  unsigned OverflowMaxCurBits =
+      std::max(NumUsedLeadingBits[0], NumUsedLeadingBits[1]);
+  bool OutputSigned = OpsFromSigned;
+  switch (BO.getOpcode()) {
+  case Instruction::FAdd:
+    IntOpc = Instruction::Add;
+    OverflowMaxOutputBits += OverflowMaxCurBits;
+    break;
+  case Instruction::FSub:
+    IntOpc = Instruction::Sub;
+    OverflowMaxOutputBits += OverflowMaxCurBits;
+    break;
+  case Instruction::FMul:
+    IntOpc = Instruction::Mul;
+    OverflowMaxOutputBits += OverflowMaxCurBits * 2;
+    break;
+  default:
+    llvm_unreachable("Unsupported binop");
+  }
+  // The precision check may have already ruled out overflow.
+  if (OverflowMaxOutputBits < IntSz) {
+    NeedsOverflowCheck = false;
+    // We can bound unsigned overflow from sub to in range signed value (this is
+    // what allows us to avoid the overflow check for sub).
+    if (IntOpc == Instruction::Sub)
+      OutputSigned = true;
+  }
+
+  // Precision check did not rule out overflow, so need to check.
+  if (NeedsOverflowCheck &&
+      !willNotOverflow(IntOpc, IntOps[0], IntOps[1], BO, OutputSigned))
+    return nullptr;
+
+  Value *IntBinOp = Builder.CreateBinOp(IntOpc, IntOps[0], IntOps[1]);
+  if (OutputSigned)
+    return new SIToFPInst(IntBinOp, FPTy);
+  return new UIToFPInst(IntBinOp, FPTy);
+}
+
 /// A binop with a constant operand and a sign-extended boolean operand may be
 /// converted into a select of constants by applying the binary operation to
 /// the constant with the two possible values of the extended boolean (0 or -1).
diff --git a/llvm/test/Transforms/InstCombine/add-sitofp.ll b/llvm/test/Transforms/InstCombine/add-sitofp.ll
index db44b806593b64..bb97e9002c29b8 100644
--- a/llvm/test/Transforms/InstCombine/add-sitofp.ll
+++ b/llvm/test/Transforms/InstCombine/add-sitofp.ll
@@ -5,8 +5,8 @@ define double @x(i32 %a, i32 %b) {
 ; CHECK-LABEL: @x(
 ; CHECK-NEXT:    [[M:%.*]] = lshr i32 [[A:%.*]], 24
 ; CHECK-NEXT:    [[N:%.*]] = and i32 [[M]], [[B:%.*]]
-; CHECK-NEXT:    [[ADDCONV:%.*]] = add nuw nsw i32 [[N]], 1
-; CHECK-NEXT:    [[P:%.*]] = sitofp i32 [[ADDCONV]] to double
+; CHECK-NEXT:    [[TMP1:%.*]] = add nuw nsw i32 [[N]], 1
+; CHECK-NEXT:    [[P:%.*]] = sitofp i32 [[TMP1]] to double
 ; CHECK-NEXT:    ret double [[P]]
 ;
   %m = lshr i32 %a, 24
@@ -19,8 +19,8 @@ define double @x(i32 %a, i32 %b) {
 define double @test(i32 %a) {
 ; CHECK-LABEL: @test(
 ; CHECK-NEXT:    [[A_AND:%.*]] = and i32 [[A:%.*]], 1073741823
-; CHECK-NEXT:    [[ADDCONV:%.*]] = add nuw nsw i32 [[A_AND]], 1
-; CHECK-NEXT:    [[RES:%.*]] = sitofp i32 [[ADDCONV]] to double
+; CHECK-NEXT:    [[TMP1:%.*]] = add nuw nsw i32 [[A_AND]], 1
+; CHECK-NEXT:    [[RES:%.*]] = sitofp i32 [[TMP1]] to double
 ; CHECK-NEXT:    ret double [[RES]]
 ;
   ; Drop two highest bits to guarantee that %a + 1 doesn't overflow
@@ -48,8 +48,8 @@ define double @test_2(i32 %a, i32 %b) {
 ; CHECK-LABEL: @test_2(
 ; CHECK-NEXT:    [[A_AND:%.*]] = and i32 [[A:%.*]], 1073741823
 ; CHECK-NEXT:    [[B_AND:%.*]] = and i32 [[B:%.*]], 1073741823
-; CHECK-NEXT:    [[ADDCONV:%.*]] = add nuw nsw i32 [[A_AND]], [[B_AND]]
-; CHECK-NEXT:    [[RES:%.*]] = sitofp i32 [[ADDCONV]] to double
+; CHECK-NEXT:    [[TMP1:%.*]] = add nuw nsw i32 [[A_AND]], [[B_AND]]
+; CHECK-NEXT:    [[RES:%.*]] = sitofp i32 [[TMP1]] to double
 ; CHECK-NEXT:    ret double [[RES]]
 ;
   ; Drop two highest bits to guarantee that %a + %b doesn't overflow
@@ -90,8 +90,8 @@ define float @test_3(i32 %a, i32 %b) {
 ; CHECK-LABEL: @test_3(
 ; CHECK-NEXT:    [[M:%.*]] = lshr i32 [[A:%.*]], 24
 ; CHECK-NEXT:    [[N:%.*]] = and i32 [[M]], [[B:%.*]]
-; CHECK-NEXT:    [[O:%.*]] = sitofp i32 [[N]] to float
-; CHECK-NEXT:    [[P:%.*]] = fadd float [[O]], 1.000000e+00
+; CHECK-NEXT:    [[TMP1:%.*]] = add nuw nsw i32 [[N]], 1
+; CHECK-NEXT:    [[P:%.*]] = sitofp i32 [[TMP1]] to float
 ; CHECK-NEXT:    ret float [[P]]
 ;
   %m = lshr i32 %a, 24
@@ -105,8 +105,8 @@ define <4 x double> @test_4(<4 x i32> %a, <4 x i32> %b) {
 ; CHECK-LABEL: @test_4(
 ; CHECK-NEXT:    [[A_AND:%.*]] = and <4 x i32> [[A:%.*]], <i32 1073741823, i32 1073741823, i32 1073741823, i32 1073741823>
 ; CHECK-NEXT:    [[B_AND:%.*]] = and <4 x i32> [[B:%.*]], <i32 1073741823, i32 1073741823, i32 1073741823, i32 1073741823>
-; CHECK-NEXT:    [[ADDCONV:%.*]] = add nuw nsw <4 x i32> [[A_AND]], [[B_AND]]
-; CHECK-NEXT:    [[RES:%.*]] = sitofp <4 x i32> [[ADDCONV]] to <4 x double>
+; CHECK-NEXT:    [[TMP1:%.*]] = add nuw nsw <4 x i32> [[A_AND]], [[B_AND]]
+; CHECK-NEXT:    [[RES:%.*]] = sitofp <4 x i32> [[TMP1]] to <4 x double>
 ; CHECK-NEXT:    ret <4 x double> [[RES]]
 ;
   ; Drop two highest bits to guarantee that %a + %b doesn't overflow
diff --git a/llvm/test/Transforms/InstCombine/binop-itofp.ll b/llvm/test/Transforms/InstCombine/binop-itofp.ll
new file mode 100644
index 00000000000000..ffa89374579145
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/binop-itofp.ll
@@ -0,0 +1,998 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+define half @test_ui_ui_i8_add(i8 noundef %x_in, i8 noundef %y_in) {
+; CHECK-LABEL: @test_ui_ui_i8_add(
+; CHECK-NEXT:    [[X:%.*]] = and i8 [[X_IN:%.*]], 127
+; CHECK-NEXT:    [[Y:%.*]] = and i8 [[Y_IN:%.*]], 127
+; CHECK-NEXT:    [[TMP1:%.*]] = add nuw i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[R:%.*]] = uitofp i8 [[TMP1]] to half
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = and i8 %x_in, 127
+  %y = and i8 %y_in, 127
+  %xf = uitofp i8 %x to half
+  %yf = uitofp i8 %y to half
+  %r = fadd half %xf, %yf
+  ret half %r
+}
+
+define half @test_ui_ui_i8_add_fail_overflow(i8 noundef %x_in, i8 noundef %y_in) {
+; CHECK-LABEL: @test_ui_ui_i8_add_fail_overflow(
+; CHECK-NEXT:    [[X:%.*]] = and i8 [[X_IN:%.*]], 127
+; CHECK-NEXT:    [[Y:%.*]] = and i8 [[Y_IN:%.*]], -127
+; CHECK-NEXT:    [[XF:%.*]] = uitofp i8 [[X]] to half
+; CHECK-NEXT:    [[YF:%.*]] = uitofp i8 [[Y]] to half
+; CHECK-NEXT:    [[R:%.*]] = fadd half [[XF]], [[YF]]
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = and i8 %x_in, 127
+  %y = and i8 %y_in, 129
+  %xf = uitofp i8 %x to half
+  %yf = uitofp i8 %y to half
+  %r = fadd half %xf, %yf
+  ret half %r
+}
+
+define half @test_ui_ui_i8_add_C(i8 noundef %x_in) {
+; CHECK-LABEL: @test_ui_ui_i8_add_C(
+; CHECK-NEXT:    [[TMP1:%.*]] = or i8 [[X_IN:%.*]], -128
+; CHECK-NEXT:    [[R:%.*]] = uitofp i8 [[TMP1]] to half
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = and i8 %x_in, 127
+  %xf = uitofp i8 %x to half
+  %r = fadd half %xf, 128.0
+  ret half %r
+}
+
+define half @test_ui_ui_i8_add_C_fail_no_repr(i8 noundef %x_in) {
+; CHECK-LABEL: @test_ui_ui_i8_add_C_fail_no_repr(
+; CHECK-NEXT:    [[X:%.*]] = and i8 [[X_IN:%.*]], 127
+; CHECK-NEXT:    [[XF:%.*]] = uitofp i8 [[X]] to half
+; CHECK-NEXT:    [[R:%.*]] = fadd half [[XF]], 0xH57F8
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = and i8 %x_in, 127
+  %xf = uitofp i8 %x to half
+  %r = fadd half %xf, 127.5
+  ret half %r
+}
+
+define half @test_ui_ui_i8_add_C_fail_overflow(i8 noundef %x_in) {
+; CHECK-LABEL: @test_ui_ui_i8_add_C_fail_overflow(
+; CHECK-NEXT:    [[X:%.*]] = and i8 [[X_IN:%.*]], 127
+; CHECK-NEXT:    [[XF:%.*]] = uitofp i8 [[X]] to half
+; CHECK-NEXT:    [[R:%.*]] = fadd half [[XF]], 0xH5808
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = and i8 %x_in, 127
+  %xf = uitofp i8 %x to half
+  %r = fadd half %xf, 129.0
+  ret half %r
+}
+
+define half @test_si_si_i8_add(i8 noundef %x_in, i8 noundef %y_in) {
+; CHECK-LABEL: @test_si_si_i8_add(
+; CHECK-NEXT:    [[X:%.*]] = or i8 [[X_IN:%.*]], -64
+; CHECK-NEXT:    [[Y:%.*]] = or i8 [[Y_IN:%.*]], -64
+; CHECK-NEXT:    [[TMP1:%.*]] = add nsw i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[R:%.*]] = sitofp i8 [[TMP1]] to half
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = or i8 %x_in, -64
+  %y = or i8 %y_in, -64
+  %xf = sitofp i8 %x to half
+  %yf = sitofp i8 %y to half
+  %r = fadd half %xf, %yf
+  ret half %r
+}
+
+define half @test_si_si_i8_add_fail_overflow(i8 noundef %x_in, i8 noundef %y_in) {
+; CHECK-LABEL: @test_si_si_i8_add_fail_overflow(
+; CHECK-NEXT:    [[X:%.*]] = or i8 [[X_IN:%.*]], -64
+; CHECK-NEXT:    [[Y:%.*]] = or i8 [[Y_IN:%.*]], -65
+; CHECK-NEXT:    [[XF:%.*]] = sitofp i8 [[X]] to half
+; CHECK-NEXT:    [[YF:%.*]] = sitofp i8 [[Y]] to half
+; CHECK-NEXT:    [[R:%.*]] = fadd half [[XF]], [[YF]]
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = or i8 %x_in, -64
+  %y = or i8 %y_in, -65
+  %xf = sitofp i8 %x to half
+  %yf = sitofp i8 %y to half
+  %r = fadd half %xf, %yf
+  ret half %r
+}
+
+define half @test_ui_si_i8_add(i8 noundef %x_in, i8 noundef %y_in) {
+; CHECK-LABEL: @test_ui_si_i8_add(
+; CHECK-NEXT:    [[X:%.*]] = and i8 [[X_IN:%.*]], 63
+; CHECK-NEXT:    [[Y:%.*]] = and i8 [[Y_IN:%.*]], 63
+; CHECK-NEXT:    [[TMP1:%.*]] = add nuw nsw i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[R:%.*]] = sitofp i8 [[TMP1]] to half
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = and i8 %x_in, 63
+  %y = and i8 %y_in, 63
+  %xf = sitofp i8 %x to half
+  %yf = uitofp i8 %y to half
+  %r = fadd half %xf, %yf
+  ret half %r
+}
+
+define half @test_ui_si_i8_add_overflow(i8 noundef %x_in, i8 noundef %y_in) {
+; CHECK-LABEL: @test_ui_si_i8_add_overflow(
+; CHECK-NEXT:    [[X:%.*]] = and i8 [[X_IN:%.*]], 63
+; CHECK-NEXT:    [[Y:%.*]] = and i8 [[Y_IN:%.*]], 65
+; CHECK-NEXT:    [[XF:%.*]] = sitofp i8 [[X]] to half
+; CHECK-NEXT:    [[YF:%.*]] = uitofp i8 [[Y]] to half
+; CHECK-NEXT:    [[R:%.*]] = fadd half [[XF]], [[YF]]
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = and i8 %x_in, 63
+  %y = and i8 %y_in, 65
+  %xf = sitofp i8 %x to half
+  %yf = uitofp i8 %y to half
+  %r = fadd half %xf, %yf
+  ret half %r
+}
+
+define half @test_ui_ui_i8_sub_C(i8 noundef %x_in) {
+; CHECK-LABEL: @test_ui_ui_i8_sub_C(
+; CHECK-NEXT:    [[TMP1:%.*]] = and i8 [[X_IN:%.*]], 127
+; CHECK-NEXT:    [[R:%.*]] = uitofp i8 [[TMP1]] to half
+; CHECK-NEXT:    ret half [[R]]
+;
+  %x = or i8 %x_in, 128
+  %xf = uitofp i8 %x to half
+  %r = fsub half %xf, 128.0
+  ret half %r
+}
+
+define half @test_ui_ui_i8_sub_C_fail_overflow(i8 noundef %x_in)...
[truncated]

goldsteinn · 2024-02-22T00:17:48Z

Result of all the proofs run locally:

; $> /home/noah/programs/opensource/llvm-dev/src/alive2/build/alive-tv (-smt-to=200000000)

----------------------------------------
define half @src_sisi_add_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = sadd_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xf = sitofp i8 noundef %x to half
  %yf = sitofp i8 noundef %y to half
  %r = fadd half %xf, %yf
  ret half %r
}
=>
define half @tgt_sisi_add_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = sadd_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xy = add i8 noundef %x, noundef %y
  %r = sitofp i8 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_uiui_add_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = uadd_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xf = uitofp i8 noundef %x to half
  %yf = uitofp i8 noundef %y to half
  %r = fadd half %xf, %yf
  ret half %r
}
=>
define half @tgt_uiui_add_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = uadd_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xy = add i8 noundef %x, noundef %y
  %r = uitofp i8 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_sisi_sub_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = ssub_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xf = sitofp i8 noundef %x to half
  %yf = sitofp i8 noundef %y to half
  %r = fsub half %xf, %yf
  ret half %r
}
=>
define half @tgt_sisi_sub_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = ssub_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xy = sub i8 noundef %x, noundef %y
  %r = sitofp i8 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_uiui_sub_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = usub_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xf = uitofp i8 noundef %x to half
  %yf = uitofp i8 noundef %y to half
  %r = fsub half %xf, %yf
  ret half %r
}
=>
define half @tgt_uiui_sub_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = usub_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xy = sub i8 noundef %x, noundef %y
  %r = uitofp i8 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_sisi_mul_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %x_okay = icmp ne i8 noundef %x, 0
  assume i1 %x_okay
  %y_okay = icmp ne i8 noundef %y, 0
  assume i1 %y_okay
  %overflow_info = smul_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xf = sitofp i8 noundef %x to half
  %yf = sitofp i8 noundef %y to half
  %r = fmul half %xf, %yf
  ret half %r
}
=>
define half @tgt_sisi_mul_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %x_okay = icmp ne i8 noundef %x, 0
  assume i1 %x_okay
  %y_okay = icmp ne i8 noundef %y, 0
  assume i1 %y_okay
  %overflow_info = smul_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xy = mul i8 noundef %x, noundef %y
  %r = sitofp i8 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_uiui_mul_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = umul_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xf = uitofp i8 noundef %x to half
  %yf = uitofp i8 noundef %y to half
  %r = fmul half %xf, %yf
  ret half %r
}
=>
define half @tgt_uiui_mul_i8(i8 noundef %x, i8 noundef %y) {
#0:
  %overflow_info = umul_overflow i8 noundef %x, noundef %y
  %does_overflow = extractvalue {i8, i1} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xy = mul i8 noundef %x, noundef %y
  %r = uitofp i8 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_sisi_add_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_high_zeros = icmp ule i16 noundef %x, 2048
  %x_high_ones = icmp uge i16 noundef %x, 63488
  %x_okay = or i1 %x_high_zeros, %x_high_ones
  assume i1 %x_okay
  %y_high_zeros = icmp ule i16 noundef %y, 2048
  %y_high_ones = icmp uge i16 noundef %y, 63488
  %y_okay = or i1 %y_high_zeros, %y_high_ones
  assume i1 %y_okay
  %xf = sitofp i16 noundef %x to half
  %yf = sitofp i16 noundef %y to half
  %r = fadd half %xf, %yf
  ret half %r
}
=>
define half @tgt_sisi_add_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_high_zeros = icmp ule i16 noundef %x, 2048
  %x_high_ones = icmp uge i16 noundef %x, 63488
  %x_okay = or i1 %x_high_zeros, %x_high_ones
  assume i1 %x_okay
  %y_high_zeros = icmp ule i16 noundef %y, 2048
  %y_high_ones = icmp uge i16 noundef %y, 63488
  %y_okay = or i1 %y_high_zeros, %y_high_ones
  assume i1 %y_okay
  %xy = add i16 noundef %x, noundef %y
  %r = sitofp i16 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_uiui_add_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_okay = icmp ule i16 noundef %x, 2048
  assume i1 %x_okay
  %y_okay = icmp ule i16 noundef %y, 2048
  assume i1 %y_okay
  %xf = uitofp i16 noundef %x to half
  %yf = uitofp i16 noundef %y to half
  %r = fadd half %xf, %yf
  ret half %r
}
=>
define half @tgt_uiui_add_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_okay = icmp ule i16 noundef %x, 2048
  assume i1 %x_okay
  %y_okay = icmp ule i16 noundef %y, 2048
  assume i1 %y_okay
  %xy = add i16 noundef %x, noundef %y
  %r = uitofp i16 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_sisi_sub_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_high_zeros = icmp ule i16 noundef %x, 2048
  %x_high_ones = icmp uge i16 noundef %x, 63488
  %x_okay = or i1 %x_high_zeros, %x_high_ones
  assume i1 %x_okay
  %y_high_zeros = icmp ule i16 noundef %y, 2048
  %y_high_ones = icmp uge i16 noundef %y, 63488
  %y_okay = or i1 %y_high_zeros, %y_high_ones
  assume i1 %y_okay
  %xf = sitofp i16 noundef %x to half
  %yf = sitofp i16 noundef %y to half
  %r = fsub half %xf, %yf
  ret half %r
}
=>
define half @tgt_sisi_sub_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_high_zeros = icmp ule i16 noundef %x, 2048
  %x_high_ones = icmp uge i16 noundef %x, 63488
  %x_okay = or i1 %x_high_zeros, %x_high_ones
  assume i1 %x_okay
  %y_high_zeros = icmp ule i16 noundef %y, 2048
  %y_high_ones = icmp uge i16 noundef %y, 63488
  %y_okay = or i1 %y_high_zeros, %y_high_ones
  assume i1 %y_okay
  %xy = sub i16 noundef %x, noundef %y
  %r = sitofp i16 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_uiui_sub_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_okay = icmp ule i16 noundef %x, 2048
  assume i1 %x_okay
  %y_okay = icmp ule i16 noundef %y, 2048
  assume i1 %y_okay
  %xf = uitofp i16 noundef %x to half
  %yf = uitofp i16 noundef %y to half
  %r = fsub half %xf, %yf
  ret half %r
}
=>
define half @tgt_uiui_sub_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_okay = icmp ule i16 noundef %x, 2048
  assume i1 %x_okay
  %y_okay = icmp ule i16 noundef %y, 2048
  assume i1 %y_okay
  %xy = sub i16 noundef %x, noundef %y
  %r = sitofp i16 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_sisi_mul_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_high_zeros = icmp ule i16 noundef %x, 2048
  %x_high_ones = icmp uge i16 noundef %x, 63488
  %x_non_zero = icmp ne i16 noundef %x, 0
  %x_okay0 = or i1 %x_high_zeros, %x_high_ones
  %x_okay = and i1 %x_okay0, %x_non_zero
  assume i1 %x_okay
  %y_high_zeros = icmp ule i16 noundef %y, 2048
  %y_high_ones = icmp uge i16 noundef %y, 63488
  %y_non_zero = icmp ne i16 noundef %y, 0
  %y_okay0 = or i1 %y_high_zeros, %y_high_ones
  %y_okay = and i1 %y_okay0, %y_non_zero
  assume i1 %y_okay
  %overflow_info = smul_overflow i16 noundef %x, noundef %y
  %does_overflow = extractvalue {i16, i1, i8} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xf = sitofp i16 noundef %x to half
  %yf = sitofp i16 noundef %y to half
  %r = fmul half %xf, %yf
  ret half %r
}
=>
define half @tgt_sisi_mul_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_high_zeros = icmp ule i16 noundef %x, 2048
  %x_high_ones = icmp uge i16 noundef %x, 63488
  %x_non_zero = icmp ne i16 noundef %x, 0
  %x_okay0 = or i1 %x_high_zeros, %x_high_ones
  %x_okay = and i1 %x_okay0, %x_non_zero
  assume i1 %x_okay
  %y_high_zeros = icmp ule i16 noundef %y, 2048
  %y_high_ones = icmp uge i16 noundef %y, 63488
  %y_non_zero = icmp ne i16 noundef %y, 0
  %y_okay0 = or i1 %y_high_zeros, %y_high_ones
  %y_okay = and i1 %y_okay0, %y_non_zero
  assume i1 %y_okay
  %overflow_info = smul_overflow i16 noundef %x, noundef %y
  %does_overflow = extractvalue {i16, i1, i8} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xy = mul i16 noundef %x, noundef %y
  %r = sitofp i16 %xy to half
  ret half %r
}
Transformation seems to be correct!


----------------------------------------
define half @src_uiui_mul_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_okay = icmp ule i16 noundef %x, 2048
  assume i1 %x_okay
  %y_okay = icmp ule i16 noundef %y, 2048
  assume i1 %y_okay
  %overflow_info = umul_overflow i16 noundef %x, noundef %y
  %does_overflow = extractvalue {i16, i1, i8} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xf = uitofp i16 noundef %x to half
  %yf = uitofp i16 noundef %y to half
  %r = fmul half %xf, %yf
  ret half %r
}
=>
define half @tgt_uiui_mul_i16(i16 noundef %x, i16 noundef %y) {
#0:
  %x_okay = icmp ule i16 noundef %x, 2048
  assume i1 %x_okay
  %y_okay = icmp ule i16 noundef %y, 2048
  assume i1 %y_okay
  %overflow_info = umul_overflow i16 noundef %x, noundef %y
  %does_overflow = extractvalue {i16, i1, i8} %overflow_info, 1
  %doesnot_overflow = xor i1 %does_overflow, 1
  assume i1 %doesnot_overflow
  %xy = mul i16 noundef %x, noundef %y
  %r = uitofp i16 %xy to half
  ret half %r
}
Transformation seems to be correct!

Summary:
  12 correct transformations
  0 incorrect transformations
  0 failed-to-prove transformations
  0 Alive2 errors

github-actions · 2024-02-22T00:19:46Z

✅ With the latest revision this PR passed the C/C++ code formatter.

PR Link: llvm/llvm-project#82555

llvm/test/Transforms/InstCombine/add-sitofp.ll

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

dtcxzyw · 2024-02-27T17:21:20Z

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

+  unsigned NumUsedLeadingBits[2] = {IntSz, IntSz};
+
+  auto IsNonZero = [&](unsigned OpNo) -> bool {
+    if (OpsKnown[OpNo].hasKnownBits() &&


Suggested change

if (OpsKnown[OpNo].hasKnownBits() &&

if (

Don't think we want to do this. We may unnecessarily compute knownbits. The idea isKnownNonZero is what we want to spend our compute on, we only check knownbits if we have it already as an early out. Same applies to nonneg.

dtcxzyw · 2024-02-27T17:21:30Z

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

+  };
+
+  auto IsNonNeg = [&](unsigned OpNo) -> bool {
+    if (OpsKnown[OpNo].hasKnownBits() &&


Suggested change

if (OpsKnown[OpNo].hasKnownBits() &&

if (

dtcxzyw · 2024-02-27T17:24:24Z

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

+    // Otherwise if its signed cast check that fp precisions >= bitwidth(op) -
+    // numSignBits(op).
+    else if (OpsFromSigned)
+      NumUsedLeadingBits[OpNo] = IntSz - ComputeNumSignBits(IntOps[OpNo]);


Can we pass WithCache to ComputeNumSignBits/ComputeMaxSignificantBits to reuse the KnownBits result?

I started working on a patch for that, but it ends up spanning across alot of value tracking. Would rather leave as TODO for now. I can add comment that it should be changed if/as we expand usage of the WithCache in ValueTracking. (overflow checks also have same issue).

That okay?

The big issue I was running into, is we need to start caching Depth and adding logic for if we want to override what we have cached (or potentially introduce regressions).
It was brought up in the original commit: #66611 (comment)
but nikic was opposed. Maybe we can revisit in the future if there is a proper compile time advantage.

arsenm

can you also test/prove the binop (sitofp), (uitofp) cases

arsenm · 2024-02-28T06:04:25Z

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

+    return (OpsFromSigned && BO.getOpcode() == Instruction::FMul)
+               ? IsNonZero(OpNo)
+               : true;


OpsFromSigned && Opcode == fmul && IsNonZero(OpNo)

Hmm, thats not the same.

It could be: !OpsFromSigned || Opcode != fmul || isNonZero(OpNo). Would you prefer that?

Basically we only need the IsNonZero for sitofp + fmul. Otherwise we are good.

Yes, select of bool to bool value is a weird pattern

goldsteinn · 2024-02-28T06:07:12Z

can you also test/prove the binop (sitofp), (uitofp) cases

The binop (sitofp), (uitofp) as we handle them are a subset of the existing proofs.
We only handle different signed cases if we convert one to the either (i.e proof one of
the ops is non-neg: (uitofp nneg x) == (sitofp nneg x)).

arsenm · 2024-02-29T11:29:34Z

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

+  unsigned MaxRepresentableBits =
+      APFloat::semanticsPrecision(FPTy->getScalarType()->getFltSemantics());


Off by 1 depending on signed or unsigned?

Yes, we are overly conservative if IntSz == MaxRepresentableBits + 1 and its signed (not we can't expand the bound for say sitofp i16 to half b.c it won't sign extend). So for things like i12, i24, ...
I'll add comment to the affect, although don't think its really worth increase code complexity to handle a special case that will almost never actually apply (weird integer widths are rare).
Proofs:

; $> /home/noah/programs/opensource/llvm-dev/src/alive2/build/alive-tv (-smt-to=200000000) ---------------------------------------- define half @src_sisi_add_i12(i12 noundef %x, i12 noundef %y) { #0: %overflow_info = sadd_overflow i12 noundef %x, noundef %y %does_overflow = extractvalue {i12, i1, i8} %overflow_info, 1 %doesnot_overflow = xor i1 %does_overflow, 1 assume i1 %doesnot_overflow %xf = sitofp i12 noundef %x to half %yf = sitofp i12 noundef %y to half %r = fadd half %xf, %yf ret half %r } => define half @tgt_sisi_add_i12(i12 noundef %x, i12 noundef %y) { #0: %overflow_info = sadd_overflow i12 noundef %x, noundef %y %does_overflow = extractvalue {i12, i1, i8} %overflow_info, 1 %doesnot_overflow = xor i1 %does_overflow, 1 assume i1 %doesnot_overflow %xy = add i12 noundef %x, noundef %y %r = sitofp i12 %xy to half ret half %r } Transformation seems to be correct! ---------------------------------------- define half @src_uiui_add_i12(i12 noundef %x, i12 noundef %y) { #0: %overflow_info = uadd_overflow i12 noundef %x, noundef %y %does_overflow = extractvalue {i12, i1, i8} %overflow_info, 1 %doesnot_overflow = xor i1 %does_overflow, 1 assume i1 %doesnot_overflow %xf = uitofp i12 noundef %x to half %yf = uitofp i12 noundef %y to half %r = fadd half %xf, %yf ret half %r } => define half @tgt_uiui_add_i12(i12 noundef %x, i12 noundef %y) { #0: %overflow_info = uadd_overflow i12 noundef %x, noundef %y %does_overflow = extractvalue {i12, i1, i8} %overflow_info, 1 %doesnot_overflow = xor i1 %does_overflow, 1 assume i1 %doesnot_overflow %xy = add i12 noundef %x, noundef %y %r = uitofp i12 %xy to half ret half %r } Transformation doesn't verify! ERROR: Value mismatch Example: i12 noundef %x = #x80d (2061, -2035) i12 noundef %y = #x002 (2) Source: {i12, i1, i8} %overflow_info = { #x80f (2063, -2033), #x0 (0), poison } i1 %does_overflow = #x0 (0) i1 %doesnot_overflow = #x1 (1) half %xf = #x6806 (2060) half %yf = #x4000 (2) half %r = #x6807 (2062) Target: {i12, i1, i8} %overflow_info = { #x80f (2063, -2033), #x0 (0), poison } i1 %does_overflow = #x0 (0) i1 %doesnot_overflow = #x1 (1) i12 %xy = #x80f (2063, -2033) half %r = #x6808 (2064) Source value: #x6807 (2062) Target value: #x6808 (2064) Summary: 1 correct transformations 1 incorrect transformations 0 failed-to-prove transformations 0 Alive2 errors

Didn't add the comment here?

Added it at the check vs IntSz in isValidPromotion.

goldsteinn · 2024-03-04T22:31:29Z

ping

arsenm · 2024-03-05T09:59:06Z

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

+    // handled specially. We can't, however, increase the bound arbitrarily for
+    // `sitofp` as for larger sizes, it won't sign extend.
+    if (MaxRepresentableBits >= IntSz)
+      ;


Empty statement weird here; invert conditions?

…itofp y))`; NFC

…tructionCombiner; NFC

The full fold is one of the following: 1) `(fp_binop ({s|u}itofp x), ({s|u}itofp y))` -> `({s|u}itofp (int_binop x, y))` 2) `(fp_binop ({s|u}itofp x), FpC)` -> `({s|u}itofp (int_binop x, (fpto{s|u}i FpC)))` And support the following binops: `fmul` -> `mul` `fadd` -> `add` `fsub` -> `sub` Proofs: https://alive2.llvm.org/ce/z/zuacA8 The proofs timeout, so they must be reproduced locally.

arsenm · 2024-03-06T13:27:35Z

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

+  unsigned MaxRepresentableBits =
+      APFloat::semanticsPrecision(FPTy->getScalarType()->getFltSemantics());


Didn't add the comment here?

sunshaoce · 2024-03-14T06:36:23Z

This commit 946ea4e3ca4c908bfa7c196b982795f5c390b923 caused the following program to produce inconsistent results:

#include <stdio.h>
static int g_12 = 0xD010L;
static float g_842 = 1.0;
static volatile int g_2345 = 1;

static float(safe_mul_func_float_f_f)(float sf1, float sf2) {
  return (((0x1.0p-100f * sf1) * (0x1.0p-28f * sf2)) >
          (0x1.0p-100f * (0x1.0p-28f * 3.40282347e+38F)))
             ? (sf1)
             : (sf1 * sf2);
}

static void func_10(int p_11) {
  g_842 = safe_mul_func_float_f_f(0.0, (short)p_11);
  g_12 = 65529UL;
  g_2345 = p_11;
}

int main(int argc, char *argv[]) {
  func_10(g_12);
  printf("%f\n", g_842);
  return 0;
}

Before: -0.000000
After: 0.000000

This is my compilation option:

build/bin/clang --target=riscv64-unknown-elf -march=rv64gcv -mabi=lp64d src/foo-main.c --gcc-toolchain=${GCC_TOOLCHAIN} --sysroot=${GCC_TOOLCHAIN}/riscv64-unknown-elf -O3
qemu-riscv64 a.out

safe_mul_func_float_f_f I removed the call to fabsf in order to eliminate the dependency on math.h.

goldsteinn · 2024-03-14T18:31:31Z

This commit 946ea4e3ca4c908bfa7c196b982795f5c390b923 caused the following program to produce inconsistent results:

#include <stdio.h>
static int g_12 = 0xD010L;
static float g_842 = 1.0;
static volatile int g_2345 = 1;

static float(safe_mul_func_float_f_f)(float sf1, float sf2) {
  return (((0x1.0p-100f * sf1) * (0x1.0p-28f * sf2)) >
          (0x1.0p-100f * (0x1.0p-28f * 3.40282347e+38F)))
             ? (sf1)
             : (sf1 * sf2);
}

static void func_10(int p_11) {
  g_842 = safe_mul_func_float_f_f(0.0, (short)p_11);
  g_12 = 65529UL;
  g_2345 = p_11;
}

int main(int argc, char *argv[]) {
  func_10(g_12);
  printf("%f\n", g_842);
  return 0;
}

Before: -0.000000 After: 0.000000

This is my compilation option:

build/bin/clang --target=riscv64-unknown-elf -march=rv64gcv -mabi=lp64d src/foo-main.c --gcc-toolchain=${GCC_TOOLCHAIN} --sysroot=${GCC_TOOLCHAIN}/riscv64-unknown-elf -O3
qemu-riscv64 a.out

safe_mul_func_float_f_f I removed the call to fabsf in order to eliminate the dependency on math.h.

Reproduce, I think the issue is we don't check constants are positive for sitofp + fmul.
If thats the issue, ill have patch up shortly. Otherwise Ill revert.
(A bit reluctant to revert as there are 3 patches that have to go with it, but if can't find fix
shortly will do so).

goldsteinn · 2024-03-14T19:00:05Z

This commit 946ea4e3ca4c908bfa7c196b982795f5c390b923 caused the following program to produce inconsistent results:
#include <stdio.h>
static int g_12 = 0xD010L;
static float g_842 = 1.0;
static volatile int g_2345 = 1;

static float(safe_mul_func_float_f_f)(float sf1, float sf2) {
  return (((0x1.0p-100f * sf1) * (0x1.0p-28f * sf2)) >
          (0x1.0p-100f * (0x1.0p-28f * 3.40282347e+38F)))
             ? (sf1)
             : (sf1 * sf2);
}

static void func_10(int p_11) {
  g_842 = safe_mul_func_float_f_f(0.0, (short)p_11);
  g_12 = 65529UL;
  g_2345 = p_11;
}

int main(int argc, char *argv[]) {
  func_10(g_12);
  printf("%f\n", g_842);
  return 0;
}
Before: -0.000000 After: 0.000000
This is my compilation option:
build/bin/clang --target=riscv64-unknown-elf -march=rv64gcv -mabi=lp64d src/foo-main.c --gcc-toolchain=${GCC_TOOLCHAIN} --sysroot=${GCC_TOOLCHAIN}/riscv64-unknown-elf -O3
qemu-riscv64 a.out
safe_mul_func_float_f_f I removed the call to fabsf in order to eliminate the dependency on math.h.
Reproduce, I think the issue is we don't check constants are ~~positive~~ non-zero for sitofp + fmul. If thats the issue, ill have patch up shortly. Otherwise Ill revert. (A bit reluctant to revert as there are 3 patches that have to go with it, but if can't find fix shortly will do so).

Yeah adding a non-zero check for constant fixes. Will have patch up soon.

Bug was introduced in llvm#82555 We where missing check that the constant was non-zero for signed + mul transform.

goldsteinn · 2024-03-14T19:27:35Z

This commit 946ea4e3ca4c908bfa7c196b982795f5c390b923 caused the following program to produce inconsistent results:
#include <stdio.h>
static int g_12 = 0xD010L;
static float g_842 = 1.0;
static volatile int g_2345 = 1;

static float(safe_mul_func_float_f_f)(float sf1, float sf2) {
  return (((0x1.0p-100f * sf1) * (0x1.0p-28f * sf2)) >
          (0x1.0p-100f * (0x1.0p-28f * 3.40282347e+38F)))
             ? (sf1)
             : (sf1 * sf2);
}

static void func_10(int p_11) {
  g_842 = safe_mul_func_float_f_f(0.0, (short)p_11);
  g_12 = 65529UL;
  g_2345 = p_11;
}

int main(int argc, char *argv[]) {
  func_10(g_12);
  printf("%f\n", g_842);
  return 0;
}
Before: -0.000000 After: 0.000000
This is my compilation option:
build/bin/clang --target=riscv64-unknown-elf -march=rv64gcv -mabi=lp64d src/foo-main.c --gcc-toolchain=${GCC_TOOLCHAIN} --sysroot=${GCC_TOOLCHAIN}/riscv64-unknown-elf -O3
qemu-riscv64 a.out
safe_mul_func_float_f_f I removed the call to fabsf in order to eliminate the dependency on math.h.
Reproduce, I think the issue is we don't check constants are ~~positive~~ non-zero for sitofp + fmul. If thats the issue, ill have patch up shortly. Otherwise Ill revert. (A bit reluctant to revert as there are 3 patches that have to go with it, but if can't find fix shortly will do so).
Yeah adding a non-zero check for constant fixes. Will have patch up soon.

Fix is up at: #85298

Bug was introduced in llvm#82555 We where missing check that the constant was non-zero for signed + mul transform.

Bug was introduced in #82555 We where missing check that the constant was non-zero for signed + mul transform. Closes #85298

goldsteinn · 2024-03-14T22:42:12Z

This commit 946ea4e3ca4c908bfa7c196b982795f5c390b923 caused the following program to produce inconsistent results:
#include <stdio.h>
static int g_12 = 0xD010L;
static float g_842 = 1.0;
static volatile int g_2345 = 1;

static float(safe_mul_func_float_f_f)(float sf1, float sf2) {
  return (((0x1.0p-100f * sf1) * (0x1.0p-28f * sf2)) >
          (0x1.0p-100f * (0x1.0p-28f * 3.40282347e+38F)))
             ? (sf1)
             : (sf1 * sf2);
}

static void func_10(int p_11) {
  g_842 = safe_mul_func_float_f_f(0.0, (short)p_11);
  g_12 = 65529UL;
  g_2345 = p_11;
}

int main(int argc, char *argv[]) {
  func_10(g_12);
  printf("%f\n", g_842);
  return 0;
}
Before: -0.000000 After: 0.000000
This is my compilation option:
build/bin/clang --target=riscv64-unknown-elf -march=rv64gcv -mabi=lp64d src/foo-main.c --gcc-toolchain=${GCC_TOOLCHAIN} --sysroot=${GCC_TOOLCHAIN}/riscv64-unknown-elf -O3
qemu-riscv64 a.out
safe_mul_func_float_f_f I removed the call to fabsf in order to eliminate the dependency on math.h.
Reproduce, I think the issue is we don't check constants are ~~positive~~ non-zero for sitofp + fmul. If thats the issue, ill have patch up shortly. Otherwise Ill revert. (A bit reluctant to revert as there are 3 patches that have to go with it, but if can't find fix shortly will do so).
Yeah adding a non-zero check for constant fixes. Will have patch up soon.
Fix is up at: #85298

Fix pushed.

We already handle the `+x` case, and noticed it was missing in the bug affecting llvm#82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV

We already handle the `+x` case, and noticed it was missing in the bug affecting #82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV Closes #85345

We already handle the `+x` case, and noticed it was missing in the bug affecting llvm#82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV Closes llvm#85345

goldsteinn requested a review from nikic as a code owner February 22, 2024 00:16

llvmbot added the llvm:transforms label Feb 22, 2024

goldsteinn requested a review from dtcxzyw February 22, 2024 00:16

goldsteinn changed the title ~~goldsteinn/fp binop folds~~ [InstCombine] Add folds for (fp_binop ({s|u}itofp x), ({s|u}itofp y)) Feb 22, 2024

goldsteinn mentioned this pull request Feb 22, 2024

[InstCombine] Canonicalize (sitofp x) -> (uitofp x) if x >= 0 #82404

Closed

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Feb 22, 2024

pre-commit: test PR82555

0774c9a

PR Link: llvm/llvm-project#82555

dtcxzyw mentioned this pull request Feb 22, 2024

pre-commit: test PR82555 dtcxzyw/llvm-opt-benchmark#255

Closed

dtcxzyw reviewed Feb 22, 2024

View reviewed changes

goldsteinn force-pushed the goldsteinn/fp-binop-folds branch from 50df541 to d04e70f Compare February 22, 2024 16:33

dtcxzyw requested a review from arsenm February 27, 2024 17:20

dtcxzyw reviewed Feb 27, 2024

View reviewed changes

goldsteinn force-pushed the goldsteinn/fp-binop-folds branch from d04e70f to b974bff Compare February 27, 2024 18:18

arsenm reviewed Feb 28, 2024

View reviewed changes

arsenm reviewed Feb 29, 2024

View reviewed changes

goldsteinn force-pushed the goldsteinn/fp-binop-folds branch from b974bff to 498f112 Compare February 29, 2024 18:19

arsenm reviewed Mar 5, 2024

View reviewed changes

goldsteinn added 3 commits March 5, 2024 13:31

[InstCombine] Add tests for folding `(fp_binop ({s|u}itofp x), ({s|u}…

5191b03

…itofp y))`; NFC

[InstCombine] Move folding (add (sitofp x), (sitofp y)) impl to Ins…

63bb759

…tructionCombiner; NFC

goldsteinn force-pushed the goldsteinn/fp-binop-folds branch from 498f112 to 0e4d930 Compare March 5, 2024 19:38

arsenm approved these changes Mar 6, 2024

View reviewed changes

goldsteinn closed this in 946ea4e Mar 6, 2024

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Mar 14, 2024

[InstCombine] Fix behavior for (fmul (sitfp x), 0)

65dacea

Bug was introduced in llvm#82555 We where missing check that the constant was non-zero for signed + mul transform.

goldsteinn mentioned this pull request Mar 14, 2024

[InstCombine] Fix behavior for (fmul (sitfp x), 0) #85298

Closed

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Mar 14, 2024

[InstCombine] Fix behavior for (fmul (sitfp x), 0)

bf7aef8

Bug was introduced in llvm#82555 We where missing check that the constant was non-zero for signed + mul transform.

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Mar 14, 2024

[InstCombine] Fix behavior for (fmul (sitfp x), 0)

5a0e73d

Bug was introduced in llvm#82555 We where missing check that the constant was non-zero for signed + mul transform.

goldsteinn added a commit that referenced this pull request Mar 14, 2024

[InstCombine] Fix behavior for (fmul (sitfp x), 0)

70d0ebb

Bug was introduced in #82555 We where missing check that the constant was non-zero for signed + mul transform. Closes #85298

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Mar 15, 2024

[InstSimply] Simplify (fmul -x, +/-0) -> -/+0

6f53eaa

We already handle the `+x` case, and noticed it was missing in the bug affecting llvm#82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Mar 15, 2024

[InstSimply] Simplify (fmul -x, +/-0) -> -/+0

d84e96c

We already handle the `+x` case, and noticed it was missing in the bug affecting llvm#82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV

goldsteinn added a commit that referenced this pull request Mar 18, 2024

[InstSimply] Simplify (fmul -x, +/-0) -> -/+0

5265be1

We already handle the `+x` case, and noticed it was missing in the bug affecting #82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV Closes #85345

		unsigned MaxRepresentableBits =
		APFloat::semanticsPrecision(FPTy->getScalarType()->getFltSemantics());

[InstCombine] Add folds for (fp_binop ({s|u}itofp x), ({s|u}itofp y)) #82555

[InstCombine] Add folds for (fp_binop ({s|u}itofp x), ({s|u}itofp y)) #82555

Uh oh!

Conversation

goldsteinn commented Feb 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Feb 22, 2024

Uh oh!

goldsteinn commented Feb 22, 2024

Uh oh!

github-actions bot commented Feb 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

arsenm left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

goldsteinn commented Feb 28, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

goldsteinn commented Mar 4, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sunshaoce commented Mar 14, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

goldsteinn commented Mar 14, 2024

Uh oh!

goldsteinn commented Mar 14, 2024

Uh oh!

goldsteinn commented Mar 14, 2024

Uh oh!

goldsteinn commented Mar 14, 2024

Uh oh!

Uh oh!

[InstCombine] Add folds for `(fp_binop ({s|u}itofp x), ({s|u}itofp y))` #82555

[InstCombine] Add folds for `(fp_binop ({s|u}itofp x), ({s|u}itofp y))` #82555

goldsteinn commented Feb 22, 2024 •

edited

Loading

github-actions bot commented Feb 22, 2024 •

edited

Loading

sunshaoce commented Mar 14, 2024 •

edited

Loading