-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[InstCombine] Fold (X * 0.0) * constant => X * 0.0 #85241 #92512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mention the issue in the description
%f1 = fmul float %a, -0.000000 | ||
%f2 = fmul float %f1, -1.000000 | ||
ret float %f2 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need tests that show flag preservation behavior.
Also, since you're matching on any Constant, would be good to test an exotic and unrealistic constant expression. Something like float bitcast (i32 ptrtoint (ptr @something to i32) to float)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the code I got after using a fast flag:
define float @mul_neg_zero_pos_value_fmf(float %a) {
; CHECK-LABEL: @mul_neg_zero_pos_value_fmf(
; CHECK-NEXT: ret float 0.000000e+00
;
%f1 = fmul fast float %a, -0.000000
%f2 = fmul fast float %f1, 1.000000
ret float %f2
}
define float @mul_pos_zero_neg_value_fmf(float %a) {
; CHECK-LABEL: @mul_pos_zero_neg_value_fmf(
; CHECK-NEXT: ret float -0.000000e+00
;
%f1 = fmul fast float %a, -0.000000
%f2 = fmul fast float %f1, -1.000000
ret float %f2
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test with specific flags, not just fast. fast implies the flags that permitted this transform in the first place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are similar transforms in InstCombinerImpl::foldFMulReassoc
and they call ConstantFoldBinaryOpOperands
, which can fail, to ensure that the new RHS gets folded to a single constant. So I think you should probably do the same.
%f1 = fmul nnan float %a, -0.000000 | ||
%f2 = fmul nnan float %f1, -1.000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should also test individual flag on one of the fmuls, and on the other. And different flags on each fmul
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
if (match(Op0, m_FMul(m_Value(X), m_Constant(C1))) && | ||
match(C1, m_AnyZeroFP()) && match(Op1, m_Constant(C))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The match of C1 to m_AnyZeroFP is redundant, you can just directly check m_AnyZeroFP where you first matched C1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_AnyZeroFP
doesn't capture so you can't write m_AnyZeroFP(C1)
. But it probably should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't point at a specific constant in the non-splat case. You should be able to just match m_AnyZeroFP and cast<Constant>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't point at a specific constant in the non-splat case.
I'd expect it to point at the whole vector-typed Constant
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, need to convert to not-draft review
define float @mul_pos_zero_neg_const_with_mixed_fmf_test2(float %a) { | ||
; CHECK-LABEL: @mul_pos_zero_neg_const_with_mixed_fmf_test2( | ||
; CHECK-NEXT: [[TMP1:%.*]] = fneg float [[A:%.*]] | ||
; CHECK-NEXT: [[F2:%.*]] = call float @llvm.copysign.f32(float 0.000000e+00, float [[TMP1]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ninf got lost, but I don't think this is an issue with this patch
@llvm/pr-subscribers-llvm-transforms Author: None (SahilPatidar) ChangesFix #85241 Full diff: https://github.com/llvm/llvm-project/pull/92512.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
index ca1b1921404d8..8d71ac8716fba 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
@@ -905,6 +905,16 @@ Instruction *InstCombinerImpl::visitFMul(BinaryOperator &I) {
}
}
+ // (X * 0.0) * constant => X * 0.0
+ if (match(Op0, m_FMul(m_Value(X), m_AnyZeroFP())) &&
+ match(Op1, m_Constant(C))) {
+ Constant *C1 = cast<Constant>(cast<Instruction>(Op0)->getOperand(1));
+ if (Constant *CC1 =
+ ConstantFoldBinaryOpOperands(Instruction::FMul, C, C1, DL)) {
+ return BinaryOperator::CreateFMulFMF(X, CC1, I.getFastMathFlags());
+ }
+ }
+
// Simplify FMUL recurrences starting with 0.0 to 0.0 if nnan and nsz are set.
// Given a phi node with entry value as 0 and it used in fmul operation,
// we can replace fmul with 0 safely and eleminate loop operation.
diff --git a/llvm/test/Transforms/InstCombine/fmul.ll b/llvm/test/Transforms/InstCombine/fmul.ll
index 1526956c5b241..e7f2fb6b41559 100644
--- a/llvm/test/Transforms/InstCombine/fmul.ll
+++ b/llvm/test/Transforms/InstCombine/fmul.ll
@@ -1407,3 +1407,250 @@ entry:
%ret = fmul <3 x float> %a, <float -0.0, float 0.0, float poison>
ret <3 x float> %ret
}
+
+define <2 x float> @mul_pos_zero_neg_const_vec(<2 x float> %a) {
+; CHECK-LABEL: @mul_pos_zero_neg_const_vec(
+; CHECK-NEXT: [[F2:%.*]] = fmul <2 x float> [[A:%.*]], <float -0.000000e+00, float -0.000000e+00>
+; CHECK-NEXT: ret <2 x float> [[F2]]
+;
+ %f1 = fmul <2 x float> %a, <float 0.000000, float 0.000000>
+ %f2 = fmul <2 x float> %f1, <float -1.000000, float -1.000000>
+ ret <2 x float> %f2
+}
+
+define <2 x float> @mul_pos_zero_mixed_neg_const_vec(<2 x float> %a) {
+; CHECK-LABEL: @mul_pos_zero_mixed_neg_const_vec(
+; CHECK-NEXT: [[F2:%.*]] = fmul <2 x float> [[A:%.*]], <float 0.000000e+00, float -0.000000e+00>
+; CHECK-NEXT: ret <2 x float> [[F2]]
+;
+ %f1 = fmul <2 x float> %a, <float 0.000000, float 0.000000>
+ %f2 = fmul <2 x float> %f1, <float 1.000000, float -1.000000>
+ ret <2 x float> %f2
+}
+
+define <2 x float> @mul_neg_zero_mixed_const_vec(<2 x float> %a) {
+; CHECK-LABEL: @mul_neg_zero_mixed_const_vec(
+; CHECK-NEXT: [[F2:%.*]] = fmul <2 x float> [[A:%.*]], <float -0.000000e+00, float -0.000000e+00>
+; CHECK-NEXT: ret <2 x float> [[F2]]
+;
+ %f1 = fmul <2 x float> %a, <float -0.000000, float 0.000000>
+ %f2 = fmul <2 x float> %f1, <float 1.000000, float -1.000000>
+ ret <2 x float> %f2
+}
+
+define <2 x float> @mul_neg_zero_mixed_const_vec_ninf(<2 x float> %a) {
+; CHECK-LABEL: @mul_neg_zero_mixed_const_vec_ninf(
+; CHECK-NEXT: [[F2:%.*]] = fmul ninf <2 x float> [[A:%.*]], <float -0.000000e+00, float -0.000000e+00>
+; CHECK-NEXT: ret <2 x float> [[F2]]
+;
+ %f1 = fmul ninf <2 x float> %a, <float -0.000000, float 0.000000>
+ %f2 = fmul ninf <2 x float> %f1, <float 1.000000, float -1.000000>
+ ret <2 x float> %f2
+}
+
+define <2 x float> @mul_neg_zero_mixed_const_vec_nnan(<2 x float> %a) {
+; CHECK-LABEL: @mul_neg_zero_mixed_const_vec_nnan(
+; CHECK-NEXT: [[TMP1:%.*]] = fneg nnan <2 x float> [[A:%.*]]
+; CHECK-NEXT: [[F2:%.*]] = call nnan <2 x float> @llvm.copysign.v2f32(<2 x float> zeroinitializer, <2 x float> [[TMP1]])
+; CHECK-NEXT: ret <2 x float> [[F2]]
+;
+ %f1 = fmul nnan <2 x float> %a, <float -0.000000, float 0.000000>
+ %f2 = fmul nnan <2 x float> %f1, <float 1.000000, float -1.000000>
+ ret <2 x float> %f2
+}
+
+define float @mul_pos_zero_neg_const(float %a) {
+; CHECK-LABEL: @mul_pos_zero_neg_const(
+; CHECK-NEXT: [[F2:%.*]] = fmul float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul float %a, 0.000000
+ %f2 = fmul float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_neg_zero_pos_const(float %a) {
+; CHECK-LABEL: @mul_neg_zero_pos_const(
+; CHECK-NEXT: [[F1:%.*]] = fmul float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: ret float [[F1]]
+;
+ %f1 = fmul float %a, -0.000000
+ %f2 = fmul float %f1, 1.000000
+ ret float %f2
+}
+
+define float @mul_neg_zero_neg_const(float %a) {
+; CHECK-LABEL: @mul_neg_zero_neg_const(
+; CHECK-NEXT: [[F2:%.*]] = fmul float [[A:%.*]], 0.000000e+00
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul float %a, -0.000000
+ %f2 = fmul float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_neg_zero_pos_const_ninf(float %a) {
+; CHECK-LABEL: @mul_neg_zero_pos_const_ninf(
+; CHECK-NEXT: [[F1:%.*]] = fmul ninf float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: ret float [[F1]]
+;
+ %f1 = fmul ninf float %a, -0.000000
+ %f2 = fmul ninf float %f1, 1.000000
+ ret float %f2
+}
+
+define float @mul_neg_zero_neg_const_ninf(float %a) {
+; CHECK-LABEL: @mul_neg_zero_neg_const_ninf(
+; CHECK-NEXT: [[F2:%.*]] = fmul ninf float [[A:%.*]], 0.000000e+00
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul ninf float %a, -0.000000
+ %f2 = fmul ninf float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_pos_zero_neg_const_ninf(float %a) {
+; CHECK-LABEL: @mul_pos_zero_neg_const_ninf(
+; CHECK-NEXT: [[F2:%.*]] = fmul ninf float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul ninf float %a, 0.000000
+ %f2 = fmul ninf float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_neg_zero_pos_const_nnan(float %a) {
+; CHECK-LABEL: @mul_neg_zero_pos_const_nnan(
+; CHECK-NEXT: [[TMP1:%.*]] = fneg nnan float [[A:%.*]]
+; CHECK-NEXT: [[F1:%.*]] = call nnan float @llvm.copysign.f32(float 0.000000e+00, float [[TMP1]])
+; CHECK-NEXT: ret float [[F1]]
+;
+ %f1 = fmul nnan float %a, -0.000000
+ %f2 = fmul nnan float %f1, 1.000000
+ ret float %f2
+}
+
+define float @mul_neg_zero_neg_const_nnan(float %a) {
+; CHECK-LABEL: @mul_neg_zero_neg_const_nnan(
+; CHECK-NEXT: [[F2:%.*]] = call nnan float @llvm.copysign.f32(float 0.000000e+00, float [[A:%.*]])
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul nnan float %a, -0.000000
+ %f2 = fmul nnan float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_pos_zero_neg_const_nnan(float %a) {
+; CHECK-LABEL: @mul_pos_zero_neg_const_nnan(
+; CHECK-NEXT: [[TMP1:%.*]] = fneg nnan float [[A:%.*]]
+; CHECK-NEXT: [[F2:%.*]] = call nnan float @llvm.copysign.f32(float 0.000000e+00, float [[TMP1]])
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul nnan float %a, 0.000000
+ %f2 = fmul nnan float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_pos_zero_neg_const_nnan_res(float %a) {
+; CHECK-LABEL: @mul_pos_zero_neg_const_nnan_res(
+; CHECK-NEXT: [[TMP1:%.*]] = fneg nnan float [[A:%.*]]
+; CHECK-NEXT: [[F2:%.*]] = call nnan float @llvm.copysign.f32(float 0.000000e+00, float [[TMP1]])
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul float %a, 0.000000
+ %f2 = fmul nnan float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_neg_const_with_nnan_fmul_result(float %a) {
+; CHECK-LABEL: @mul_neg_const_with_nnan_fmul_result(
+; CHECK-NEXT: [[TMP1:%.*]] = fneg float [[A:%.*]]
+; CHECK-NEXT: [[F2:%.*]] = call float @llvm.copysign.f32(float 0.000000e+00, float [[TMP1]])
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul nnan float %a, 0.000000
+ %f2 = fmul float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_pos_zero_neg_const_ninf_res(float %a) {
+; CHECK-LABEL: @mul_pos_zero_neg_const_ninf_res(
+; CHECK-NEXT: [[F2:%.*]] = fmul ninf float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul float %a, 0.000000
+ %f2 = fmul ninf float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_neg_const_with_ninf_fmul_result(float %a) {
+; CHECK-LABEL: @mul_neg_const_with_ninf_fmul_result(
+; CHECK-NEXT: [[F2:%.*]] = fmul float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul ninf float %a, 0.000000
+ %f2 = fmul float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_pos_zero_neg_const_with_mixed_fmf_test1(float %a) {
+; CHECK-LABEL: @mul_pos_zero_neg_const_with_mixed_fmf_test1(
+; CHECK-NEXT: [[TMP1:%.*]] = fneg nnan float [[A:%.*]]
+; CHECK-NEXT: [[F2:%.*]] = call nnan float @llvm.copysign.f32(float 0.000000e+00, float [[TMP1]])
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul ninf float %a, 0.000000
+ %f2 = fmul nnan float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_pos_zero_neg_const_with_mixed_fmf_test2(float %a) {
+; CHECK-LABEL: @mul_pos_zero_neg_const_with_mixed_fmf_test2(
+; CHECK-NEXT: [[TMP1:%.*]] = fneg float [[A:%.*]]
+; CHECK-NEXT: [[F2:%.*]] = call float @llvm.copysign.f32(float 0.000000e+00, float [[TMP1]])
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul nnan float %a, 0.000000
+ %f2 = fmul ninf float %f1, -1.000000
+ ret float %f2
+}
+
+define float @mul_neg_zero_const_expr(float %a) {
+; CHECK-LABEL: @mul_neg_zero_const_expr(
+; CHECK-NEXT: [[F3:%.*]] = fmul float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: ret float [[F3]]
+;
+ %f1 = fmul float %a, -0.000000e+00
+ %i1 = inttoptr i32 1000 to ptr
+ %i = ptrtoint ptr %i1 to i32
+ %f2 = bitcast i32 %i to float
+ %f3 = fmul float %f1, %f2
+ ret float %f3
+}
+
+define float @mul_neg_zero_expr(float %a, ptr %b) {
+; CHECK-LABEL: @mul_neg_zero_expr(
+; CHECK-NEXT: [[F1:%.*]] = fmul float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint ptr [[B:%.*]] to i64
+; CHECK-NEXT: [[I:%.*]] = trunc i64 [[TMP1]] to i32
+; CHECK-NEXT: [[F2:%.*]] = bitcast i32 [[I]] to float
+; CHECK-NEXT: [[F3:%.*]] = fmul float [[F1]], [[F2]]
+; CHECK-NEXT: ret float [[F3]]
+;
+ %f1 = fmul float %a, -0.000000e+00
+ %i = ptrtoint ptr %b to i32
+ %f2 = bitcast i32 %i to float
+ %f3 = fmul float %f1, %f2
+ ret float %f3
+}
+
+define float @mul_neg_zero_expr2(float %a, ptr %b) {
+; CHECK-LABEL: @mul_neg_zero_expr2(
+; CHECK-NEXT: [[F1:%.*]] = fmul float [[A:%.*]], -0.000000e+00
+; CHECK-NEXT: [[F2:%.*]] = fmul float [[F1]], bitcast (i32 ptrtoint (ptr getelementptr inbounds ({ [2 x ptr] }, ptr @g, i64 1, i32 0, i64 0) to i32) to float)
+; CHECK-NEXT: ret float [[F2]]
+;
+ %f1 = fmul float %a, -0.000000e+00
+ %f2 = fmul float %f1, bitcast (i32 ptrtoint (ptr getelementptr inbounds ({ [2 x ptr] }, ptr @g, i64 0, i32 0, i64 2) to i32) to float)
+ ret float %f2
+}
|
Constant *C1 = cast<Constant>(cast<Instruction>(Op0)->getOperand(1)); | ||
if (Constant *CC1 = | ||
ConstantFoldBinaryOpOperands(Instruction::FMul, C, C1, DL)) { | ||
return BinaryOperator::CreateFMulFMF(X, CC1, I.getFastMathFlags()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ninf
cannot be propagated: https://alive2.llvm.org/ce/z/s-p5Jv
Please add alive2 proofs for FMF preservation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I think it would be safer to only propagate flags that are present on both of the original fmul instructions. But I agree that you should check all cases with alive2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still propagating ninf
even though I used return BinaryOperator::CreateFMul(X, CC1);
. I don't understand if it is happening internally during the BinaryOperator
creation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flags shouldn't just appear out of nowhere from CreateFMul.
Most straightforward should be I.getFastMathFlags() & cast<FPMathOperator>(Op0)->getFastMathFlags()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I doing something wrong?
FastMathFlags FMF = I.getFastMathFlags() &
cast<FPMathOperator>(Op0)->getFastMathFlags();
return BinaryOperator::CreateFMulFMF(X, CC1, FMF);
still propagating ninf
flag.
define float @mul_pos_zero_neg_const_ninf_res(float %a) {
; CHECK-LABEL: @mul_pos_zero_neg_const_ninf_res(
; CHECK-NEXT: [[F2:%.*]] = fmul ninf float [[A:%.*]], -0.000000e+00
; CHECK-NEXT: ret float [[F2]]
;
%f1 = fmul float %a, 0.000000
%f2 = fmul ninf float %f1, -1.000000
ret float %f2
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried looking into this, and this combine isn't even firing. This is coming from somewhere else. This combine seems to be duplicated, somewhere, less precisely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://alive2.llvm.org/ce/z/yYnLEi this is a separate combine that's breaking the flag handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://alive2.llvm.org/ce/z/MHShQY longer version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to fix #93769 before this
@@ -905,6 +905,16 @@ Instruction *InstCombinerImpl::visitFMul(BinaryOperator &I) { | |||
} | |||
} | |||
|
|||
// (X * 0.0) * constant => X * 0.0 | |||
if (match(Op0, m_FMul(m_Value(X), m_AnyZeroFP())) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this up, and combine with the existing handling of x * 0 -> copysign(0, x)
Constant *C1 = cast<Constant>(cast<Instruction>(Op0)->getOperand(1)); | ||
if (Constant *CC1 = | ||
ConstantFoldBinaryOpOperands(Instruction::FMul, C, C1, DL)) { | ||
return BinaryOperator::CreateFMulFMF(X, CC1, I.getFastMathFlags()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried looking into this, and this combine isn't even firing. This is coming from somewhere else. This combine seems to be duplicated, somewhere, less precisely
Reverse ping. |
Fix #85241
Alive2: https://alive2.llvm.org/ce/z/yZFMTY