[NFC][AMDGPU] Add lit tests for FMA combining with freeze and nnan variants #142628

harrisonGPU · 2025-06-03T15:09:17Z

Since PR #142345, a `freeze` of an `fmul` (without `nnan`) followed by an `fadd` or `fsub` can be combined into a single `fma`.
This patch adds lit tests that verify the generated code for both the `nnan` and non-`nnan` variants.

llvmbot · 2025-06-03T15:09:54Z

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-amdgpu

Author: Harrison Hao (harrisonGPU)

Changes

After PR #142345, a freeze of an fmul (without nnan) followed by an fadd or fsub can be combined into a single fma.
This patch adds lit tests that verify the generated code for both the nnan and non-nnan variants.

Closes: #141622

Full diff: https://github.com/llvm/llvm-project/pull/142628.diff

1 Files Affected:

(added) llvm/test/CodeGen/AMDGPU/fold-freeze-fmul-to-fma.ll (+106)

diff --git a/llvm/test/CodeGen/AMDGPU/fold-freeze-fmul-to-fma.ll b/llvm/test/CodeGen/AMDGPU/fold-freeze-fmul-to-fma.ll
new file mode 100644
index 0000000000000..dbf5636ae03ed
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/fold-freeze-fmul-to-fma.ll
@@ -0,0 +1,106 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 < %s | FileCheck %s
+
+define float @fma_from_freeze_mul_add_left(float %x, float %y) {
+; CHECK-LABEL: fma_from_freeze_mul_add_left:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT:    v_fma_f32 v0, v0, v1, 1.0
+; CHECK-NEXT:    s_setpc_b64 s[30:31]
+  %mul = fmul reassoc nsz arcp contract afn float %x, %y
+  %mul.fr = freeze float %mul
+  %add = fadd reassoc nsz arcp contract afn float %mul.fr, 1.000000e+00
+  ret float %add
+}
+
+define float @fma_from_freeze_mul_add_left_with_nnan(float %x, float %y) {
+; CHECK-LABEL: fma_from_freeze_mul_add_left_with_nnan:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT:    v_mul_f32_e32 v0, v0, v1
+; CHECK-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; CHECK-NEXT:    v_add_f32_e32 v0, 1.0, v0
+; CHECK-NEXT:    s_setpc_b64 s[30:31]
+  %mul = fmul reassoc nnan nsz arcp contract afn float %x, %y
+  %mul.fr = freeze float %mul
+  %add = fadd reassoc nnan nsz arcp contract afn float %mul.fr, 1.000000e+00
+  ret float %add
+}
+
+define float @fma_from_freeze_mul_add_right(float %x, float %y) {
+; CHECK-LABEL: fma_from_freeze_mul_add_right:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT:    v_fma_f32 v0, v0, v1, 1.0
+; CHECK-NEXT:    s_setpc_b64 s[30:31]
+  %mul = fmul reassoc nsz arcp contract afn float %x, %y
+  %mul.fr = freeze float %mul
+  %add = fadd reassoc nsz arcp contract afn float 1.000000e+00, %mul.fr
+  ret float %add
+}
+
+define float @fma_from_freeze_mul_add_right_with_nnan(float %x, float %y) {
+; CHECK-LABEL: fma_from_freeze_mul_add_right_with_nnan:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT:    v_mul_f32_e32 v0, v0, v1
+; CHECK-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; CHECK-NEXT:    v_add_f32_e32 v0, 1.0, v0
+; CHECK-NEXT:    s_setpc_b64 s[30:31]
+  %mul = fmul reassoc nnan nsz arcp contract afn float %x, %y
+  %mul.fr = freeze float %mul
+  %add = fadd reassoc nnan nsz arcp contract afn float 1.000000e+00, %mul.fr
+  ret float %add
+}
+
+define float @fma_from_freeze_mul_sub_left(float %x, float %y) {
+; CHECK-LABEL: fma_from_freeze_mul_sub_left:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT:    v_fma_f32 v0, v0, v1, -1.0
+; CHECK-NEXT:    s_setpc_b64 s[30:31]
+  %mul = fmul reassoc nsz arcp contract afn float %x, %y
+  %mul.fr = freeze float %mul
+  %sub = fsub reassoc nsz arcp contract afn float %mul.fr, 1.000000e+00
+  ret float %sub
+}
+
+define float @fma_from_freeze_mul_sub_left_with_nnan(float %x, float %y) {
+; CHECK-LABEL: fma_from_freeze_mul_sub_left_with_nnan:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT:    v_mul_f32_e32 v0, v0, v1
+; CHECK-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; CHECK-NEXT:    v_add_f32_e32 v0, -1.0, v0
+; CHECK-NEXT:    s_setpc_b64 s[30:31]
+  %mul = fmul reassoc nnan nsz arcp contract afn float %x, %y
+  %mul.fr = freeze float %mul
+  %sub = fsub reassoc nnan nsz arcp contract afn float %mul.fr, 1.000000e+00
+  ret float %sub
+}
+
+define float @fma_from_freeze_mul_sub_right(float %x, float %y) {
+; CHECK-LABEL: fma_from_freeze_mul_sub_right:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT:    v_fma_f32 v0, -v0, v1, 1.0
+; CHECK-NEXT:    s_setpc_b64 s[30:31]
+  %mul = fmul reassoc nsz arcp contract afn float %x, %y
+  %mul.fr = freeze float %mul
+  %sub = fsub reassoc nsz arcp contract afn float 1.000000e+00, %mul.fr
+  ret float %sub
+}
+
+define float @fma_from_freeze_mul_sub_right_with_nnan(float %x, float %y) {
+; CHECK-LABEL: fma_from_freeze_mul_sub_right_with_nnan:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT:    v_mul_f32_e32 v0, v0, v1
+; CHECK-NEXT:    s_delay_alu instid0(VALU_DEP_1)
+; CHECK-NEXT:    v_sub_f32_e32 v0, 1.0, v0
+; CHECK-NEXT:    s_setpc_b64 s[30:31]
+  %mul = fmul reassoc nnan nsz arcp contract afn float %x, %y
+  %mul.fr = freeze float %mul
+  %sub = fsub reassoc nnan nsz arcp contract afn float 1.000000e+00, %mul.fr
+  ret float %sub
+}

jayfoad · 2025-06-03T16:13:45Z

> After this PR #142345, combining freeze on fmul (without nnan) followed by fadd or fsub into a single fma is supported. This patch adds lit tests to verify the optimization behavior for both nnan and non-nnan variants.

Thanks - but making this work in the presence of ninf and nnan is pretty important too.

harrisonGPU · 2025-06-03T16:17:16Z

> > After this PR #142345, combining freeze on fmul (without nnan) followed by fadd or fsub into a single fma is supported. This patch adds lit tests to verify the optimization behavior for both nnan and non-nnan variants.
>
> Thanks - but making this work in the presence of ninf and nnan is pretty important too.

I will implement it in PR #142250.
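
For illustration, an `ninf` counterpart to the tests in this patch could look like the sketch below. It mirrors `fma_from_freeze_mul_add_left` with `ninf` added to both instructions. The function name is hypothetical, and no CHECK lines are shown because the expected output has not been generated.

```llvm
; Hypothetical ninf variant (name is illustrative, not part of this patch).
; Same shape as @fma_from_freeze_mul_add_left, with ninf on fmul and fadd.
define float @fma_from_freeze_mul_add_left_with_ninf(float %x, float %y) {
  %mul = fmul reassoc ninf nsz arcp contract afn float %x, %y
  %mul.fr = freeze float %mul
  %add = fadd reassoc ninf nsz arcp contract afn float %mul.fr, 1.000000e+00
  ret float %add
}
```

Passing this through the test's own `llc -mtriple=amdgcn -mcpu=gfx1100` invocation shows whether a single `v_fma_f32` or a separate multiply and add is selected; `utils/update_llc_test_checks.py` can then generate the assertions.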

jayfoad

LGTM

harrisonGPU requested review from jayfoad, arsenm, nikic, RKSimon and shiltian June 3, 2025 15:09

llvmbot added the backend:AMDGPU label Jun 3, 2025

harrisonGPU added and removed the llvm:SelectionDAG label Jun 3, 2025

shiltian reviewed Jun 3, 2025

llvm/test/CodeGen/AMDGPU/fold-freeze-fmul-to-fma.ll (outdated, resolved)

harrisonGPU changed the title ~~[DAG][AMDGPU] Add lit tests for FMA combining with freeze and nnan variants~~ [NFC][AMDGPU] Add lit tests for FMA combining with freeze and nnan variants Jun 4, 2025

jayfoad approved these changes Jun 4, 2025

arsenm reviewed Jun 4, 2025

llvm/test/CodeGen/AMDGPU/fold-freeze-fmul-to-fma.ll (outdated, resolved)

harrisonGPU requested a review from arsenm June 4, 2025 12:07

arsenm approved these changes Jun 4, 2025

harrisonGPU added 3 commits June 4, 2025 15:27

[DAG][AMDGPU] Add lit tests for FMA combining with freeze and nnan va…

0904583

…riants

[AMDGPU] Update.

13d3821

[AMDGPU] Update lit test.

9e6d5f8

harrisonGPU force-pushed the dag-fold-test branch from c2626dc to 9e6d5f8 Compare June 4, 2025 15:27

harrisonGPU merged commit 8ca220f into llvm:main Jun 5, 2025
11 checks passed

harrisonGPU deleted the dag-fold-test branch June 5, 2025 01:55
