[DAG] canCreateUndefOrPoison – mark fneg/fadd/fsub/fmul/fdiv/frem as not poison generating #142345
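@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-amdgpu
Author: Harrison Hao (harrisonGPU)
Changes
After revisiting the LLVM Language Reference Manual, it is confirmed that plain floating-point operations (fneg, fadd, fsub, fmul, fdiv, and frem) propagate poison but do not inherently create new poison values. Thus, SelectionDAG::canCreateUndefOrPoison should return false for these operations by default.
Poison generation in FP instructions occurs only when specific fast-math flags (nnan, ninf, or the collective fast) are present, as these flags explicitly convert NaN or Inf results into poison.
References: the LangRef entries for the fneg, fadd, fsub, fmul, fdiv, and frem instructions and for Fast-Math Flags.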
Full diff: https://github.com/llvm/llvm-project/pull/142345.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 1506bc4ee187d..279c7daf71c33 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -5579,6 +5579,12 @@ bool SelectionDAG::canCreateUndefOrPoison(SDValue Op, const APInt &DemandedElts,
case ISD::ADD:
case ISD::SUB:
case ISD::MUL:
+ case ISD::FNEG:
+ case ISD::FADD:
+ case ISD::FSUB:
+ case ISD::FMUL:
+ case ISD::FDIV:
+ case ISD::FREM:
// No poison except from flags (which is handled above)
return false;
diff --git a/llvm/test/CodeGen/AMDGPU/freeze-binary.ll b/llvm/test/CodeGen/AMDGPU/freeze-binary.ll
new file mode 100644
index 0000000000000..4321cedcc8b96
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/freeze-binary.ll
@@ -0,0 +1,282 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck %s -check-prefix GFX11
+define float @freeze_fadd(float %input) nounwind {
+; GFX11-LABEL: freeze_fadd:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_add_f32_e32 v0, 2.0, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fadd reassoc nsz arcp contract afn float %input, 1.000000e+00
+ %y = freeze float %x
+ %z = fadd reassoc nsz arcp contract afn float %y, 1.000000e+00
+ ret float %z
+}
+
+define float @freeze_fadd_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_fadd_nnan:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_add_f32_e32 v0, 1.0, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT: v_add_f32_e32 v0, 1.0, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fadd nnan contract float %input, 1.000000e+00
+ %y = freeze float %x
+ %z = fadd nnan contract float %y, 1.000000e+00
+ ret float %z
+}
+
+define <4 x float> @freeze_fadd_vec(<4 x float> %input) nounwind {
+; GFX11-LABEL: freeze_fadd_vec:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_dual_add_f32 v0, 0x40a00000, v0 :: v_dual_add_f32 v1, 0x40a00000, v1
+; GFX11-NEXT: v_dual_add_f32 v2, 0x40a00000, v2 :: v_dual_add_f32 v3, 0x40a00000, v3
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fadd reassoc nsz arcp contract afn <4 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+ %y = freeze <4 x float> %x
+ %z = fadd reassoc nsz arcp contract afn <4 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+ ret <4 x float> %z
+}
+
+define float @freeze_fsub(float %input) nounwind {
+; GFX11-LABEL: freeze_fsub:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_subrev_f32_e32 v0, 1.0, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT: v_subrev_f32_e32 v0, 1.0, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fsub reassoc nsz arcp contract afn float %input, 1.000000e+00
+ %y = freeze float %x
+ %z = fsub reassoc nsz arcp contract afn float %y, 1.000000e+00
+ ret float %z
+}
+
+define float @freeze_fsub_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_fsub_nnan:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_subrev_f32_e32 v0, 1.0, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT: v_subrev_f32_e32 v0, 1.0, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fsub nnan contract float %input, 1.000000e+00
+ %y = freeze float %x
+ %z = fsub nnan contract float %y, 1.000000e+00
+ ret float %z
+}
+
+define <4 x float> @freeze_fsub_vec(<4 x float> %input) nounwind {
+; GFX11-LABEL: freeze_fsub_vec:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_dual_add_f32 v0, 0xc0a00000, v0 :: v_dual_add_f32 v1, 0xc0a00000, v1
+; GFX11-NEXT: v_dual_add_f32 v2, 0xc0a00000, v2 :: v_dual_add_f32 v3, 0xc0a00000, v3
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fsub reassoc nsz arcp contract afn <4 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+ %y = freeze <4 x float> %x
+ %z = fsub reassoc nsz arcp contract afn <4 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+ ret <4 x float> %z
+}
+
+define float @freeze_fmul(float %input) nounwind {
+; GFX11-LABEL: freeze_fmul:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_mul_f32_e32 v0, 4.0, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fmul reassoc nsz arcp contract afn float %input, 2.000000e+00
+ %y = freeze float %x
+ %z = fmul reassoc nsz arcp contract afn float %y, 2.000000e+00
+ ret float %z
+}
+
+define float @freeze_fmul_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_fmul_nnan:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_add_f32_e32 v0, v0, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT: v_add_f32_e32 v0, v0, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fmul nnan contract float %input, 2.000000e+00
+ %y = freeze float %x
+ %z = fmul nnan contract float %y, 2.000000e+00
+ ret float %z
+}
+
+define <8 x float> @freeze_fmul_vec(<8 x float> %input) nounwind {
+; GFX11-LABEL: freeze_fmul_vec:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_dual_mul_f32 v0, 4.0, v0 :: v_dual_mul_f32 v1, 0x40c00000, v1
+; GFX11-NEXT: v_dual_mul_f32 v2, 0x40c00000, v2 :: v_dual_mul_f32 v3, 4.0, v3
+; GFX11-NEXT: v_dual_mul_f32 v4, 4.0, v4 :: v_dual_mul_f32 v5, 0x40c00000, v5
+; GFX11-NEXT: v_dual_mul_f32 v6, 0x40c00000, v6 :: v_dual_mul_f32 v7, 4.0, v7
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fmul reassoc nsz arcp contract afn <8 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+ %y = freeze <8 x float> %x
+ %z = fmul reassoc nsz arcp contract afn <8 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+ ret <8 x float> %z
+}
+
+define float @freeze_fdiv(float %input) nounwind {
+; GFX11-LABEL: freeze_fdiv:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_mul_f32_e32 v0, 0x3e800000, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fdiv reassoc nsz arcp contract afn float %input, 2.000000e+00
+ %y = freeze float %x
+ %z = fdiv reassoc nsz arcp contract afn float %y, 2.000000e+00
+ ret float %z
+}
+
+define float @freeze_fdiv_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_fdiv_nnan:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_mul_f32_e32 v0, 0.5, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT: v_mul_f32_e32 v0, 0.5, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fdiv nnan contract float %input, 2.000000e+00
+ %y = freeze float %x
+ %z = fdiv nnan contract float %y, 2.000000e+00
+ ret float %z
+}
+
+define <8 x float> @freeze_fdiv_vec(<8 x float> %input) nounwind {
+; GFX11-LABEL: freeze_fdiv_vec:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_dual_mul_f32 v3, 0x3e800000, v3 :: v_dual_mul_f32 v4, 0x3e800000, v4
+; GFX11-NEXT: v_dual_mul_f32 v0, 0x3e800000, v0 :: v_dual_mul_f32 v7, 0x3e800000, v7
+; GFX11-NEXT: v_dual_mul_f32 v1, 0x3e2aaaab, v1 :: v_dual_mul_f32 v2, 0x3e2aaaab, v2
+; GFX11-NEXT: v_dual_mul_f32 v5, 0x3e2aaaab, v5 :: v_dual_mul_f32 v6, 0x3e2aaaab, v6
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = fdiv reassoc nsz arcp contract afn <8 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+ %y = freeze <8 x float> %x
+ %z = fdiv reassoc nsz arcp contract afn <8 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+ ret <8 x float> %z
+}
+
+define float @freeze_frem(float %input) nounwind {
+; GFX11-LABEL: freeze_frem:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_mul_f32_e32 v1, 0.5, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_trunc_f32_e32 v1, v1
+; GFX11-NEXT: v_fmac_f32_e32 v0, -2.0, v1
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_mul_f32_e32 v1, 0.5, v0
+; GFX11-NEXT: v_trunc_f32_e32 v1, v1
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT: v_fmac_f32_e32 v0, -2.0, v1
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = frem reassoc nsz arcp contract afn float %input, 2.000000e+00
+ %y = freeze float %x
+ %z = frem reassoc nsz arcp contract afn float %y, 2.000000e+00
+ ret float %z
+}
+
+define float @freeze_frem_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_frem_nnan:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_mul_f32_e32 v1, 0.5, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_trunc_f32_e32 v1, v1
+; GFX11-NEXT: v_fma_f32 v1, -2.0, v1, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_bfi_b32 v1, 0x7fffffff, v1, v0
+; GFX11-NEXT: v_mul_f32_e32 v2, 0.5, v1
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_trunc_f32_e32 v2, v2
+; GFX11-NEXT: v_fmac_f32_e32 v1, -2.0, v2
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT: v_bfi_b32 v0, 0x7fffffff, v1, v0
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = frem nnan contract float %input, 2.000000e+00
+ %y = freeze float %x
+ %z = frem nnan contract float %y, 2.000000e+00
+ ret float %z
+}
+
+define <8 x float> @freeze_frem_vec(<8 x float> %input) nounwind {
+; GFX11-LABEL: freeze_frem_vec:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_dual_mul_f32 v8, 0x3e800000, v4 :: v_dual_mul_f32 v9, 0x3e800000, v3
+; GFX11-NEXT: v_trunc_f32_e32 v11, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT: v_trunc_f32_e32 v8, v8
+; GFX11-NEXT: v_trunc_f32_e32 v9, v9
+; GFX11-NEXT: v_mul_f32_e32 v10, 0.5, v6
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_4) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT: v_dual_sub_f32 v0, v0, v11 :: v_dual_mul_f32 v11, 0x3eaaaaab, v5
+; GFX11-NEXT: v_dual_fmac_f32 v4, -4.0, v8 :: v_dual_fmac_f32 v3, -4.0, v9
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_3) | instskip(SKIP_1) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_trunc_f32_e32 v10, v10
+; GFX11-NEXT: v_trunc_f32_e32 v9, v7
+; GFX11-NEXT: v_dual_fmac_f32 v6, -2.0, v10 :: v_dual_sub_f32 v7, v7, v9
+; GFX11-NEXT: v_mul_f32_e32 v8, 0.5, v1
+; GFX11-NEXT: v_trunc_f32_e32 v9, v11
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT: v_mul_f32_e32 v11, 0x3e800000, v7
+; GFX11-NEXT: v_trunc_f32_e32 v8, v8
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_4)
+; GFX11-NEXT: v_fmac_f32_e32 v1, -2.0, v8
+; GFX11-NEXT: v_fmac_f32_e32 v5, 0xc0400000, v9
+; GFX11-NEXT: v_mul_f32_e32 v10, 0x3eaaaaab, v2
+; GFX11-NEXT: v_mul_f32_e32 v12, 0x3e800000, v0
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_2)
+; GFX11-NEXT: v_trunc_f32_e32 v8, v10
+; GFX11-NEXT: v_trunc_f32_e32 v10, v12
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_2) | instskip(SKIP_2) | instid1(VALU_DEP_4)
+; GFX11-NEXT: v_fmac_f32_e32 v2, 0xc0400000, v8
+; GFX11-NEXT: v_trunc_f32_e32 v8, v11
+; GFX11-NEXT: v_mul_f32_e32 v12, 0x3eaaaaab, v1
+; GFX11-NEXT: v_dual_fmac_f32 v0, -4.0, v10 :: v_dual_mul_f32 v11, 0.5, v5
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT: v_fmac_f32_e32 v7, -4.0, v8
+; GFX11-NEXT: v_trunc_f32_e32 v9, v12
+; GFX11-NEXT: v_mul_f32_e32 v12, 0x3eaaaaab, v6
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_2) | instskip(SKIP_2) | instid1(VALU_DEP_2)
+; GFX11-NEXT: v_fmac_f32_e32 v1, 0xc0400000, v9
+; GFX11-NEXT: v_trunc_f32_e32 v9, v11
+; GFX11-NEXT: v_trunc_f32_e32 v11, v3
+; GFX11-NEXT: v_dual_mul_f32 v10, 0.5, v2 :: v_dual_fmac_f32 v5, -2.0, v9
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_2)
+; GFX11-NEXT: v_sub_f32_e32 v3, v3, v11
+; GFX11-NEXT: v_trunc_f32_e32 v8, v10
+; GFX11-NEXT: v_trunc_f32_e32 v10, v12
+; GFX11-NEXT: v_trunc_f32_e32 v12, v4
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT: v_fmac_f32_e32 v2, -2.0, v8
+; GFX11-NEXT: v_fmac_f32_e32 v6, 0xc0400000, v10
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_3)
+; GFX11-NEXT: v_sub_f32_e32 v4, v4, v12
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %x = frem reassoc nsz arcp contract afn <8 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+ %y = freeze <8 x float> %x
+ %z = frem reassoc nsz arcp contract afn <8 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+ ret <8 x float> %z
+}
We found this problem while I was implementing PR #142250.
Looks right to me.
entry:
  %x = fadd nnan contract float %input, 1.000000e+00
  %y = freeze float %x
  %z = fadd nnan contract float %y, 1.000000e+00
Should this also have reassoc? Otherwise I think the transform would not happen even if nnan were missing, so the test doesn't quite show what you want.
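To illustrate why reassoc matters for this test: folding `(x + 1.0) + 1.0` into `x + 2.0` changes where rounding happens, so the fold is only valid when reassociation is permitted. The following is a minimal C++ sketch, not LLVM code. The helper names `addStepwise` and `addFolded` are hypothetical, and the sketch assumes default IEEE-754 single-precision arithmetic (no `-ffast-math`).

```cpp
// Hypothetical helpers, not LLVM code: they model the two evaluation orders
// of the test pattern "(x + c1) + c2" in IEEE-754 single precision.

// Evaluates the pattern as written: the intermediate sum is rounded to float
// before the second addition.
float addStepwise(float x, float c1, float c2) {
  volatile float t = x + c1; // volatile keeps the intermediate rounding step
  return t + c2;
}

// Evaluates the reassociated form x + (c1 + c2), which is what a combine
// may produce once reassoc allows it.
float addFolded(float x, float c1, float c2) {
  volatile float c = c1 + c2; // constants are combined first
  return x + c;
}
```

With `x = 16777216.0f` (2^24) and both constants equal to `1.0f`, the stepwise form returns 16777216 because `x + 1` rounds back to `x`, while the folded form returns 16777218. For small inputs such as `x = 1.0f` the two forms agree.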
Thanks a lot! Initially, I intended to test the case where Op->hasPoisonGeneratingFlags() returns true, so I used the nnan flag as an example:
llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Lines 5578 to 5579 in e3a0cb8
  if (ConsiderFlags && Op->hasPoisonGeneratingFlags())
    return true;
However, upon further inspection, I realize this isn't necessary. When we actually call canCreateUndefOrPoison in this context, we explicitly set ConsiderFlags to false, as shown in the following snippet:
  if (DAG.canCreateUndefOrPoison(N0, /*PoisonOnly*/ false,
                                 /*ConsiderFlags*/ false) ||
      N0->getNumValues() != 1 || !N0->hasOneUse())
    return SDValue();
Thus, even if I set the nnan flag, it won't affect this particular test case. Therefore, I've removed the unnecessary tests involving nnan. Additionally, I previously forgot to include tests for fneg, which I have now added.
What do you think?
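The interaction between ConsiderFlags and the opcode switch described above can be sketched with a small toy model. This is not the LLVM API: `Opcode`, `Node`, and `canCreatePoison` are hypothetical stand-ins, and the conservative `true` for unknown opcodes is an assumption of the model.

```cpp
// Toy model, not the LLVM API: all names here are hypothetical stand-ins.
enum class Opcode { FNeg, FAdd, FSub, FMul, FDiv, FRem, Other };

struct Node {
  Opcode Op;
  bool HasPoisonGeneratingFlags; // e.g. nnan or ninf is set on the node
};

// Mirrors the shape of the check quoted above: flags are consulted first,
// and only when the caller asks for them; the opcode switch comes second.
bool canCreatePoison(const Node &N, bool ConsiderFlags) {
  if (ConsiderFlags && N.HasPoisonGeneratingFlags)
    return true;
  switch (N.Op) {
  case Opcode::FNeg:
  case Opcode::FAdd:
  case Opcode::FSub:
  case Opcode::FMul:
  case Opcode::FDiv:
  case Opcode::FRem:
    return false; // no poison except from flags (handled above)
  case Opcode::Other:
    break;
  }
  return true; // assumption of this model: unknown opcodes stay conservative
}
```

In this model, `canCreatePoison(Node{Opcode::FAdd, true}, false)` is false while the same node with `ConsiderFlags` set to true yields true, which matches the observation that an nnan flag cannot influence a caller that passes `ConsiderFlags = false`.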
fneg tests?
Thanks, I have added them.
Test name should be more specific
define <8 x float> @freeze_fneg_vec(<8 x float> %input) nounwind {
; CHECK-LABEL: freeze_fneg_vec:
; CHECK: ; %bb.0:
; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; CHECK-NEXT: s_setpc_b64 s[30:31]
  %x = fneg <8 x float> %input
  %y = freeze <8 x float> %x
  %z = fneg <8 x float> %y
  ret <8 x float> %z
}
In this case the different vector type tests aren't that useful
Thanks, I have removed it.
Thanks, I followed the naming convention from llvm/test/CodeGen/X86/freeze-binary.ll, could you please give me some suggestions?
LGTM
Thanks
…not poison generating (llvm#142345)

After revisiting the LLVM Language Reference Manual, it is confirmed that plain floating-point operations (`fneg`, `fadd`, `fsub`, `fmul`, `fdiv`, and `frem`) propagate poison but do not inherently create new poison values. Thus, `SelectionDAG::canCreateUndefOrPoison` should return `false` for these operations by default.

Poison generation in FP instructions occurs only when specific fast-math flags (`nnan`, `ninf`, or the collective fast) are present, as these flags explicitly convert NaN or Inf results into poison.

References:
- [`fneg` instruction documentation](https://llvm.org/docs/LangRef.html#fneg-instruction)
- [`fadd` instruction documentation](https://llvm.org/docs/LangRef.html#fadd-instruction)
- [`fsub` instruction documentation](https://llvm.org/docs/LangRef.html#fsub-instruction)
- [`fmul` instruction documentation](https://llvm.org/docs/LangRef.html#fmul-instruction)
- [`fdiv` instruction documentation](https://llvm.org/docs/LangRef.html#fdiv-instruction)
- [`frem` instruction documentation](https://llvm.org/docs/LangRef.html#frem-instruction)
- [Fast-Math Flags documentation](https://llvm.org/docs/LangRef.html#fast-math-flags)
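
The distinction between propagating and creating poison can be made concrete with a small model. This is a hedged sketch, not LLVM code: `MaybePoison` and `modelFAdd` are hypothetical names, `std::nullopt` stands in for poison, and the flag handling follows the LangRef wording that a NaN (for `nnan`) or infinite (for `ninf`) argument or result produces poison. It requires C++17.

```cpp
#include <cmath>
#include <optional>

// Hypothetical model, not LLVM code: std::nullopt stands in for poison.
using MaybePoison = std::optional<float>;

// Models `fadd` with optional nnan/ninf flags.
MaybePoison modelFAdd(MaybePoison A, MaybePoison B, bool NNan, bool NInf) {
  if (!A || !B)
    return std::nullopt; // poison operands propagate regardless of flags
  float R = *A + *B;
  // nnan: a NaN argument or result becomes poison.
  if (NNan && (std::isnan(*A) || std::isnan(*B) || std::isnan(R)))
    return std::nullopt;
  // ninf: an infinite argument or result becomes poison.
  if (NInf && (std::isinf(*A) || std::isinf(*B) || std::isinf(R)))
    return std::nullopt;
  return R; // without flags, the result is an ordinary value, possibly NaN
}
```

In this model, adding positive and negative infinity without flags returns an ordinary NaN value, while the same call with `NNan` set returns `std::nullopt`. Only the flagged variant introduces poison that was not already present in the operands.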