[DAG] canCreateUndefOrPoison – mark fneg/fadd/fsub/fmul/fdiv/frem as not poison generating #142345

Merged: harrisonGPU merged 4 commits into llvm:main on Jun 3, 2025

harrisonGPU
Contributor

After revisiting the LLVM Language Reference Manual, it is confirmed that
plain floating-point operations (fneg, fadd, fsub, fmul, fdiv, and frem)
propagate poison but do not inherently create new poison values. Thus,
SelectionDAG::canCreateUndefOrPoison should return false for these
operations by default.

Poison generation in FP instructions occurs only when specific fast-math
flags (nnan, ninf, or the collective fast) are present, as these flags
explicitly convert NaN or Inf results into poison.

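References:

- fneg instruction documentation: https://llvm.org/docs/LangRef.html#fneg-instruction
- fadd instruction documentation: https://llvm.org/docs/LangRef.html#fadd-instruction
- fsub instruction documentation: https://llvm.org/docs/LangRef.html#fsub-instruction
- fmul instruction documentation: https://llvm.org/docs/LangRef.html#fmul-instruction
- fdiv instruction documentation: https://llvm.org/docs/LangRef.html#fdiv-instruction
- frem instruction documentation: https://llvm.org/docs/LangRef.html#frem-instruction
- Fast-Math Flags documentation: https://llvm.org/docs/LangRef.html#fast-math-flags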
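
The distinction between propagating and creating poison can be sketched with a toy value model. This is only an illustration, not LLVM code: `Value`, `ToyFlags`, and `toyFAdd` are hypothetical names, and `std::nullopt` stands in for poison. It assumes the LangRef reading summarized above, where `nnan` and `ninf` turn a NaN or Inf argument or result into poison.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <optional>

// Toy model: std::nullopt plays the role of poison. Not an LLVM type.
using Value = std::optional<float>;

// Hypothetical stand-in for the two poison-generating fast-math flags.
struct ToyFlags {
  bool NoNaNs = false; // models nnan
  bool NoInfs = false; // models ninf
};

// True if V breaks a promise made by the flags.
static bool violates(float V, ToyFlags Flags) {
  return (Flags.NoNaNs && std::isnan(V)) || (Flags.NoInfs && std::isinf(V));
}

// Models fadd: a poison operand propagates, and new poison appears only
// when a flag is set and an argument or the result violates it.
Value toyFAdd(Value A, Value B, ToyFlags Flags = {}) {
  if (!A || !B)
    return std::nullopt; // propagation, independent of flags
  if (violates(*A, Flags) || violates(*B, Flags))
    return std::nullopt; // flag-based poison from an argument
  float R = *A + *B;
  if (violates(R, Flags))
    return std::nullopt; // flag-based poison from the result
  return R;              // plain fadd: NaN and Inf are ordinary values
}
```

In this model, `toyFAdd(inf, -inf)` with no flags yields an ordinary NaN, while the same call with `NoNaNs` set yields poison. That is the behavior the patch relies on when it returns false for the unflagged opcodes.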

@llvmbot llvmbot added backend:AMDGPU llvm:SelectionDAG SelectionDAGISel as well labels Jun 2, 2025
@llvmbot
Member

llvmbot commented Jun 2, 2025

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-amdgpu

Author: Harrison Hao (harrisonGPU)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/142345.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+6)
  • (added) llvm/test/CodeGen/AMDGPU/freeze-binary.ll (+282)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 1506bc4ee187d..279c7daf71c33 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -5579,6 +5579,12 @@ bool SelectionDAG::canCreateUndefOrPoison(SDValue Op, const APInt &DemandedElts,
   case ISD::ADD:
   case ISD::SUB:
   case ISD::MUL:
+  case ISD::FNEG:
+  case ISD::FADD:
+  case ISD::FSUB:
+  case ISD::FMUL:
+  case ISD::FDIV:
+  case ISD::FREM:
     // No poison except from flags (which is handled above)
     return false;
 
diff --git a/llvm/test/CodeGen/AMDGPU/freeze-binary.ll b/llvm/test/CodeGen/AMDGPU/freeze-binary.ll
new file mode 100644
index 0000000000000..4321cedcc8b96
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/freeze-binary.ll
@@ -0,0 +1,282 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck %s -check-prefix GFX11
+define float @freeze_fadd(float %input) nounwind {
+; GFX11-LABEL: freeze_fadd:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_add_f32_e32 v0, 2.0, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fadd reassoc nsz arcp contract afn float %input, 1.000000e+00
+  %y = freeze float %x
+  %z = fadd reassoc nsz arcp contract afn float %y, 1.000000e+00
+  ret float %z
+}
+
+define float @freeze_fadd_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_fadd_nnan:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_add_f32_e32 v0, 1.0, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT:    v_add_f32_e32 v0, 1.0, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fadd nnan contract float %input, 1.000000e+00
+  %y = freeze float %x
+  %z = fadd nnan contract float %y, 1.000000e+00
+  ret float %z
+}
+
+define <4 x float> @freeze_fadd_vec(<4 x float> %input) nounwind {
+; GFX11-LABEL: freeze_fadd_vec:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_dual_add_f32 v0, 0x40a00000, v0 :: v_dual_add_f32 v1, 0x40a00000, v1
+; GFX11-NEXT:    v_dual_add_f32 v2, 0x40a00000, v2 :: v_dual_add_f32 v3, 0x40a00000, v3
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fadd reassoc nsz arcp contract afn <4 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+  %y = freeze <4 x float> %x
+  %z = fadd reassoc nsz arcp contract afn <4 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+  ret <4 x float> %z
+}
+
+define float @freeze_fsub(float %input) nounwind {
+; GFX11-LABEL: freeze_fsub:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_subrev_f32_e32 v0, 1.0, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT:    v_subrev_f32_e32 v0, 1.0, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fsub reassoc nsz arcp contract afn float %input, 1.000000e+00
+  %y = freeze float %x
+  %z = fsub reassoc nsz arcp contract afn float %y, 1.000000e+00
+  ret float %z
+}
+
+define float @freeze_fsub_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_fsub_nnan:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_subrev_f32_e32 v0, 1.0, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT:    v_subrev_f32_e32 v0, 1.0, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fsub nnan contract float %input, 1.000000e+00
+  %y = freeze float %x
+  %z = fsub nnan contract float %y, 1.000000e+00
+  ret float %z
+}
+
+define <4 x float> @freeze_fsub_vec(<4 x float> %input) nounwind {
+; GFX11-LABEL: freeze_fsub_vec:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_dual_add_f32 v0, 0xc0a00000, v0 :: v_dual_add_f32 v1, 0xc0a00000, v1
+; GFX11-NEXT:    v_dual_add_f32 v2, 0xc0a00000, v2 :: v_dual_add_f32 v3, 0xc0a00000, v3
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fsub reassoc nsz arcp contract afn <4 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+  %y = freeze <4 x float> %x
+  %z = fsub reassoc nsz arcp contract afn <4 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+  ret <4 x float> %z
+}
+
+define float @freeze_fmul(float %input) nounwind {
+; GFX11-LABEL: freeze_fmul:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_mul_f32_e32 v0, 4.0, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fmul reassoc nsz arcp contract afn float %input, 2.000000e+00
+  %y = freeze float %x
+  %z = fmul reassoc nsz arcp contract afn float %y, 2.000000e+00
+  ret float %z
+}
+
+define float @freeze_fmul_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_fmul_nnan:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_add_f32_e32 v0, v0, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT:    v_add_f32_e32 v0, v0, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fmul nnan contract float %input, 2.000000e+00
+  %y = freeze float %x
+  %z = fmul nnan contract float %y, 2.000000e+00
+  ret float %z
+}
+
+define <8 x float> @freeze_fmul_vec(<8 x float> %input) nounwind {
+; GFX11-LABEL: freeze_fmul_vec:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_dual_mul_f32 v0, 4.0, v0 :: v_dual_mul_f32 v1, 0x40c00000, v1
+; GFX11-NEXT:    v_dual_mul_f32 v2, 0x40c00000, v2 :: v_dual_mul_f32 v3, 4.0, v3
+; GFX11-NEXT:    v_dual_mul_f32 v4, 4.0, v4 :: v_dual_mul_f32 v5, 0x40c00000, v5
+; GFX11-NEXT:    v_dual_mul_f32 v6, 0x40c00000, v6 :: v_dual_mul_f32 v7, 4.0, v7
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fmul reassoc nsz arcp contract afn <8 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+  %y = freeze <8 x float> %x
+  %z = fmul reassoc nsz arcp contract afn <8 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+  ret <8 x float> %z
+}
+
+define float @freeze_fdiv(float %input) nounwind {
+; GFX11-LABEL: freeze_fdiv:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_mul_f32_e32 v0, 0x3e800000, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fdiv reassoc nsz arcp contract afn float %input, 2.000000e+00
+  %y = freeze float %x
+  %z = fdiv reassoc nsz arcp contract afn float %y, 2.000000e+00
+  ret float %z
+}
+
+define float @freeze_fdiv_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_fdiv_nnan:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_mul_f32_e32 v0, 0.5, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT:    v_mul_f32_e32 v0, 0.5, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fdiv nnan contract float %input, 2.000000e+00
+  %y = freeze float %x
+  %z = fdiv nnan contract float %y, 2.000000e+00
+  ret float %z
+}
+
+define <8 x float> @freeze_fdiv_vec(<8 x float> %input) nounwind {
+; GFX11-LABEL: freeze_fdiv_vec:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_dual_mul_f32 v3, 0x3e800000, v3 :: v_dual_mul_f32 v4, 0x3e800000, v4
+; GFX11-NEXT:    v_dual_mul_f32 v0, 0x3e800000, v0 :: v_dual_mul_f32 v7, 0x3e800000, v7
+; GFX11-NEXT:    v_dual_mul_f32 v1, 0x3e2aaaab, v1 :: v_dual_mul_f32 v2, 0x3e2aaaab, v2
+; GFX11-NEXT:    v_dual_mul_f32 v5, 0x3e2aaaab, v5 :: v_dual_mul_f32 v6, 0x3e2aaaab, v6
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = fdiv reassoc nsz arcp contract afn <8 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+  %y = freeze <8 x float> %x
+  %z = fdiv reassoc nsz arcp contract afn <8 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+  ret <8 x float> %z
+}
+
+define float @freeze_frem(float %input) nounwind {
+; GFX11-LABEL: freeze_frem:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_mul_f32_e32 v1, 0.5, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT:    v_trunc_f32_e32 v1, v1
+; GFX11-NEXT:    v_fmac_f32_e32 v0, -2.0, v1
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT:    v_mul_f32_e32 v1, 0.5, v0
+; GFX11-NEXT:    v_trunc_f32_e32 v1, v1
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT:    v_fmac_f32_e32 v0, -2.0, v1
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = frem reassoc nsz arcp contract afn float %input, 2.000000e+00
+  %y = freeze float %x
+  %z = frem reassoc nsz arcp contract afn float %y, 2.000000e+00
+  ret float %z
+}
+
+define float @freeze_frem_nnan(float %input) nounwind {
+; GFX11-LABEL: freeze_frem_nnan:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_mul_f32_e32 v1, 0.5, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT:    v_trunc_f32_e32 v1, v1
+; GFX11-NEXT:    v_fma_f32 v1, -2.0, v1, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT:    v_bfi_b32 v1, 0x7fffffff, v1, v0
+; GFX11-NEXT:    v_mul_f32_e32 v2, 0.5, v1
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT:    v_trunc_f32_e32 v2, v2
+; GFX11-NEXT:    v_fmac_f32_e32 v1, -2.0, v2
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT:    v_bfi_b32 v0, 0x7fffffff, v1, v0
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = frem nnan contract float %input, 2.000000e+00
+  %y = freeze float %x
+  %z = frem nnan contract float %y, 2.000000e+00
+  ret float %z
+}
+
+define <8 x float> @freeze_frem_vec(<8 x float> %input) nounwind {
+; GFX11-LABEL: freeze_frem_vec:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:    v_dual_mul_f32 v8, 0x3e800000, v4 :: v_dual_mul_f32 v9, 0x3e800000, v3
+; GFX11-NEXT:    v_trunc_f32_e32 v11, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT:    v_trunc_f32_e32 v8, v8
+; GFX11-NEXT:    v_trunc_f32_e32 v9, v9
+; GFX11-NEXT:    v_mul_f32_e32 v10, 0.5, v6
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_4) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT:    v_dual_sub_f32 v0, v0, v11 :: v_dual_mul_f32 v11, 0x3eaaaaab, v5
+; GFX11-NEXT:    v_dual_fmac_f32 v4, -4.0, v8 :: v_dual_fmac_f32 v3, -4.0, v9
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_3) | instskip(SKIP_1) | instid1(VALU_DEP_1)
+; GFX11-NEXT:    v_trunc_f32_e32 v10, v10
+; GFX11-NEXT:    v_trunc_f32_e32 v9, v7
+; GFX11-NEXT:    v_dual_fmac_f32 v6, -2.0, v10 :: v_dual_sub_f32 v7, v7, v9
+; GFX11-NEXT:    v_mul_f32_e32 v8, 0.5, v1
+; GFX11-NEXT:    v_trunc_f32_e32 v9, v11
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT:    v_mul_f32_e32 v11, 0x3e800000, v7
+; GFX11-NEXT:    v_trunc_f32_e32 v8, v8
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_4)
+; GFX11-NEXT:    v_fmac_f32_e32 v1, -2.0, v8
+; GFX11-NEXT:    v_fmac_f32_e32 v5, 0xc0400000, v9
+; GFX11-NEXT:    v_mul_f32_e32 v10, 0x3eaaaaab, v2
+; GFX11-NEXT:    v_mul_f32_e32 v12, 0x3e800000, v0
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_2)
+; GFX11-NEXT:    v_trunc_f32_e32 v8, v10
+; GFX11-NEXT:    v_trunc_f32_e32 v10, v12
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(SKIP_2) | instid1(VALU_DEP_4)
+; GFX11-NEXT:    v_fmac_f32_e32 v2, 0xc0400000, v8
+; GFX11-NEXT:    v_trunc_f32_e32 v8, v11
+; GFX11-NEXT:    v_mul_f32_e32 v12, 0x3eaaaaab, v1
+; GFX11-NEXT:    v_dual_fmac_f32 v0, -4.0, v10 :: v_dual_mul_f32 v11, 0.5, v5
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT:    v_fmac_f32_e32 v7, -4.0, v8
+; GFX11-NEXT:    v_trunc_f32_e32 v9, v12
+; GFX11-NEXT:    v_mul_f32_e32 v12, 0x3eaaaaab, v6
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(SKIP_2) | instid1(VALU_DEP_2)
+; GFX11-NEXT:    v_fmac_f32_e32 v1, 0xc0400000, v9
+; GFX11-NEXT:    v_trunc_f32_e32 v9, v11
+; GFX11-NEXT:    v_trunc_f32_e32 v11, v3
+; GFX11-NEXT:    v_dual_mul_f32 v10, 0.5, v2 :: v_dual_fmac_f32 v5, -2.0, v9
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_2)
+; GFX11-NEXT:    v_sub_f32_e32 v3, v3, v11
+; GFX11-NEXT:    v_trunc_f32_e32 v8, v10
+; GFX11-NEXT:    v_trunc_f32_e32 v10, v12
+; GFX11-NEXT:    v_trunc_f32_e32 v12, v4
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_3)
+; GFX11-NEXT:    v_fmac_f32_e32 v2, -2.0, v8
+; GFX11-NEXT:    v_fmac_f32_e32 v6, 0xc0400000, v10
+; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_3)
+; GFX11-NEXT:    v_sub_f32_e32 v4, v4, v12
+; GFX11-NEXT:    s_setpc_b64 s[30:31]
+entry:
+  %x = frem reassoc nsz arcp contract afn <8 x float> %input, <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00, float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00>
+  %y = freeze <8 x float> %x
+  %z = frem reassoc nsz arcp contract afn <8 x float> %y, <float 4.000000e+00, float 3.000000e+00, float 2.000000e+00, float 1.000000e+00, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+00>
+  ret <8 x float> %z
+}
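
The hexadecimal immediates in the vector CHECK lines can be cross-checked by hand. The sketch below is a standalone helper, not part of the patch; `bitsOf` and `foldedFMulConstants` are hypothetical names. It assumes that, once the freeze no longer blocks the combine, the two constant operands of `freeze_fmul_vec` are multiplied lane by lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns the IEEE-754 bit pattern of a float, as printed in the CHECK lines.
std::uint32_t bitsOf(float F) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Lane-wise product of the two constant vectors used in freeze_fmul_vec.
std::array<float, 8> foldedFMulConstants() {
  const std::array<float, 8> First = {1.0f, 2.0f, 3.0f, 4.0f,
                                      4.0f, 3.0f, 2.0f, 1.0f};
  const std::array<float, 8> Second = {4.0f, 3.0f, 2.0f, 1.0f,
                                       1.0f, 2.0f, 3.0f, 4.0f};
  std::array<float, 8> Out{};
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = First[I] * Second[I];
  return Out;
}
```

The products alternate between 4.0 and 6.0, and `bitsOf(6.0f)` is `0x40c00000`, which matches the `v_dual_mul_f32` operands. Likewise `bitsOf(5.0f)` is `0x40a00000` (the `freeze_fadd_vec` immediate, since every lane sums to 5.0) and `bitsOf(0.25f)` is `0x3e800000` (the `freeze_fdiv` immediate).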

@harrisonGPU
Contributor Author

I found this problem while implementing PR #142250.

Contributor

@nikic nikic left a comment

Looks right to me.

entry:
%x = fadd nnan contract float %input, 1.000000e+00
%y = freeze float %x
%z = fadd nnan contract float %y, 1.000000e+00
Contributor

Should also have reassoc? I think otherwise the transform would not happen even if nnan was missing, so the test doesn't quite show what you want.
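
To illustrate why the fold needs reassoc regardless of nnan: adding 1.0 twice and adding 2.0 once are not the same float computation, so combining the constants is a reassociation. The sketch below is a standalone demonstration with hypothetical function names. It assumes IEEE-754 single precision with round-to-nearest-even and no fast-math compiler options.

```cpp
#include <cassert>

// Adds 1.0f twice, rounding to float after each step, which is what the
// IR requires when reassoc is absent.
float addOneTwice(float X) {
  // volatile forces each intermediate to be stored as a float and keeps
  // the compiler from combining the two additions itself.
  volatile float Step = X + 1.0f;
  Step = Step + 1.0f;
  return Step;
}

// Adds the pre-combined constant 2.0f in one step, which is what the
// fold produces.
float addTwoOnce(float X) {
  volatile float Sum = X + 2.0f;
  return Sum;
}
```

For `X = 16777216.0f` (2^24), `addOneTwice` returns 16777216.0f because each `+ 1.0f` lands on a tie that rounds back to 2^24, while `addTwoOnce` returns 16777218.0f. For small inputs such as 1.0f the two agree, which is why the difference is easy to miss.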

Contributor Author

Thanks a lot! Initially, I intended to test the case where Op->hasPoisonGeneratingFlags() returns true, so I used the nnan flag as an example:

  if (ConsiderFlags && Op->hasPoisonGeneratingFlags())
    return true;

However, upon further inspection, I realize this isn't necessary. When we actually call canCreateUndefOrPoison in this context, we explicitly set ConsiderFlags to false, as shown in the following snippet:

  if (DAG.canCreateUndefOrPoison(N0, /*PoisonOnly*/ false,
                                 /*ConsiderFlags*/ false) ||
      N0->getNumValues() != 1 || !N0->hasOneUse())
    return SDValue();

Thus, even if I set the nnan flag, it won't affect this particular test case. Therefore, I've removed the unnecessary tests involving nnan. Additionally, I previously forgot to include tests for fneg, which I have now added.

What do you think?
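
The interaction between the two snippets above can be sketched as a small standalone model. This is not the LLVM implementation: `ToyOpcode`, `ToyNode`, `toyCanCreateUndefOrPoison`, and `toyMayPushFreeze` are hypothetical names, only two of the six opcodes are modeled, and the conservative `true` for unknown opcodes is an assumption about the default case.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real code works on SDNode and SelectionDAG.
enum class ToyOpcode { FAdd, FNeg, Other };

struct ToyNode {
  ToyOpcode Opcode;
  bool HasPoisonGeneratingFlags; // e.g. nnan or ninf is set
  unsigned NumValues;
  bool HasOneUse;
};

// Mirrors the shape of the check quoted above: flags first, then opcode.
bool toyCanCreateUndefOrPoison(const ToyNode &N, bool ConsiderFlags) {
  if (ConsiderFlags && N.HasPoisonGeneratingFlags)
    return true;
  switch (N.Opcode) {
  case ToyOpcode::FAdd:
  case ToyOpcode::FNeg:
    return false; // no poison except from flags (handled above)
  case ToyOpcode::Other:
    return true; // assumed conservative default
  }
  return true;
}

// Mirrors the guard quoted above: the caller passes ConsiderFlags = false,
// so the flags on N0 never influence the answer.
bool toyMayPushFreeze(const ToyNode &N0) {
  return !toyCanCreateUndefOrPoison(N0, /*ConsiderFlags=*/false) &&
         N0.NumValues == 1 && N0.HasOneUse;
}
```

With a flagged `FAdd` node, `toyCanCreateUndefOrPoison(N, true)` is true but `toyMayPushFreeze(N)` is still true. This matches the observation that adding nnan does not change the outcome of this particular guard.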

Collaborator

@RKSimon RKSimon left a comment

fneg tests?

@harrisonGPU
Contributor Author

fneg tests?

Thanks, I have added it.

@harrisonGPU harrisonGPU requested review from RKSimon and nikic June 2, 2025 09:16
Contributor

@arsenm arsenm left a comment

Test name should be more specific

Comment on lines 15 to 24
define <8 x float> @freeze_fneg_vec(<8 x float> %input) nounwind {
; CHECK-LABEL: freeze_fneg_vec:
; CHECK: ; %bb.0:
; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; CHECK-NEXT: s_setpc_b64 s[30:31]
%x = fneg <8 x float> %input
%y = freeze <8 x float> %x
%z = fneg <8 x float> %y
ret <8 x float> %z
}
Contributor

In this case the different vector type tests aren't that useful

Contributor Author

Thanks, I have removed it.

@harrisonGPU
Contributor Author

Test name should be more specific

Thanks. I followed the naming convention from llvm/test/CodeGen/X86/freeze-binary.ll. Could you please suggest a more specific name?

Collaborator

@RKSimon RKSimon left a comment

LGTM

@harrisonGPU
Contributor Author

Thanks

@harrisonGPU harrisonGPU merged commit 0107c93 into llvm:main Jun 3, 2025
11 checks passed
@harrisonGPU harrisonGPU deleted the dag-freeze branch June 3, 2025 11:21
rorth pushed a commit to rorth/llvm-project that referenced this pull request Jun 11, 2025
…not poison generating (llvm#142345)

After revisiting the LLVM Language Reference Manual, it is confirmed that
plain floating-point operations (`fneg`, `fadd`, `fsub`, `fmul`, `fdiv`, and
`frem`) propagate poison but do not inherently create new poison values. Thus,
`SelectionDAG::canCreateUndefOrPoison` should return `false` for these
operations by default.

Poison generation in FP instructions occurs only when specific fast-math
flags (`nnan`, `ninf`, or the collective `fast`) are present, as these flags
explicitly convert NaN or Inf results into poison.

References:

- [`fneg` instruction documentation](https://llvm.org/docs/LangRef.html#fneg-instruction)
- [`fadd` instruction documentation](https://llvm.org/docs/LangRef.html#fadd-instruction)
- [`fsub` instruction documentation](https://llvm.org/docs/LangRef.html#fsub-instruction)
- [`fmul` instruction documentation](https://llvm.org/docs/LangRef.html#fmul-instruction)
- [`fdiv` instruction documentation](https://llvm.org/docs/LangRef.html#fdiv-instruction)
- [`frem` instruction documentation](https://llvm.org/docs/LangRef.html#frem-instruction)
- [Fast-Math Flags documentation](https://llvm.org/docs/LangRef.html#fast-math-flags)
DhruvSrivastavaX pushed a commit to DhruvSrivastavaX/lldb-for-aix that referenced this pull request Jun 12, 2025
…not poison generating (llvm#142345)

Labels
backend:AMDGPU llvm:SelectionDAG SelectionDAGISel as well
5 participants