AMDGPU: Disable most fmed3 folds for strictfp #139530

arsenm · 2025-05-12T10:28:18Z

No description provided.

arsenm · 2025-05-12T10:28:31Z

This stack of pull requests is managed by Graphite.

llvmbot · 2025-05-12T10:28:58Z

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/139530.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp (+3)
(modified) llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll (+6-6)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
index e76396f6ffbb0..1494428cb2bf5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
@@ -855,6 +855,9 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
         return IC.replaceInstUsesWith(II, Src);
     }
 
+    if (II.isStrictFP())
+      break;
+
     // Checking for NaN before canonicalization provides better fidelity when
     // mapping other operations onto fmed3 since the order of operands is
     // unchanged.
diff --git a/llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll b/llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll
index 5274ac1093a26..bf94637b36a34 100644
--- a/llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll
+++ b/llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll
@@ -494,7 +494,7 @@ define amdgpu_ps float @amdgpu_ps_attr_fmed3_x_y_snan1_f32(float %x, float %y) #
 define float @fmed3_qnan0_x_y_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_qnan0_x_y_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3:[0-9]+]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.minnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8000000000000, float [[X]], float [[Y]]) #[[ATTR5:[0-9]+]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8000000000000, float %x, float %y) strictfp
@@ -504,7 +504,7 @@ define float @fmed3_qnan0_x_y_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_x_qnan0_y_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_x_qnan0_y_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.minnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float [[X]], float 0x7FF8000000000000, float [[Y]]) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float 0x7FF8000000000000, float %y) strictfp
@@ -514,7 +514,7 @@ define float @fmed3_x_qnan0_y_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_x_y_qnan0_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_x_y_qnan0_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.maxnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float [[X]], float [[Y]], float 0x7FF8000000000000) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 0x7FF8000000000000) strictfp
@@ -524,7 +524,7 @@ define float @fmed3_x_y_qnan0_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_snan1_x_y_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_snan1_x_y_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.minnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float 0x7FF4000000000000, float [[X]], float [[Y]]) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF4000000000000, float %x, float %y) strictfp
@@ -534,7 +534,7 @@ define float @fmed3_snan1_x_y_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_x_snan1_y_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_x_snan1_y_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.minnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float [[X]], float 0x7FF4000000000000, float [[Y]]) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float 0x7FF4000000000000, float %y) strictfp
@@ -544,7 +544,7 @@ define float @fmed3_x_snan1_y_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_x_y_snan1_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_x_y_snan1_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.maxnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float [[X]], float [[Y]], float 0x7FF4000000000000) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 0x7FF4000000000000) strictfp
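
The PR has no description, so the following is only a reader's sketch of what the diff appears to do, not the LLVM implementation. It models the decision visible in the test changes: before the patch, a constant NaN in operand 0 or 1 folded the call to `llvm.minnum` and a NaN in operand 2 folded it to `llvm.maxnum`, even on `strictfp` calls. After the patch, a `strictfp` call reaches the new early `break` and stays an `llvm.amdgcn.fmed3` call. The names `Fold`, `chooseFold` and `nanOperand` are hypothetical and exist only for this illustration. The real code also performs other folds that are not modelled here, including the ones that sit above the new guard and still run for `strictfp` calls.

```cpp
// Toy model of the NaN-operand fold decision. Hypothetical names throughout;
// this is NOT the code in AMDGPUInstCombineIntrinsic.cpp.

enum class Fold {
  KeepMed3, // leave the llvm.amdgcn.fmed3 call untouched
  MinNum,   // rewrite to llvm.minnum of the two remaining operands
  MaxNum    // rewrite to llvm.maxnum of the two remaining operands
};

// nanOperand: index (0, 1 or 2) of the operand that is a constant NaN,
// or -1 when no operand is a constant NaN (other folds are not modelled).
constexpr Fold chooseFold(bool isStrictFP, int nanOperand) {
  return isStrictFP        ? Fold::KeepMed3 // models the `break` added by this patch
         : nanOperand == 0 ? Fold::MinNum   // removed CHECK: minnum(x, y)
         : nanOperand == 1 ? Fold::MinNum   // removed CHECK: minnum(x, y)
         : nanOperand == 2 ? Fold::MaxNum   // removed CHECK: maxnum(x, y)
                           : Fold::KeepMed3;
}

// Compile-time checks mirroring the updated expectations in fmed3.ll.
static_assert(chooseFold(true, 0) == Fold::KeepMed3, "strictfp keeps fmed3");
static_assert(chooseFold(false, 0) == Fold::MinNum, "non-strict src0 NaN folds");
```

The `MaxNum` case reflects only the CHECK lines removed by this diff; the referenced follow-up #139531 is titled as changing the src2-NaN fold to `minnum`.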

arsenm · 2025-05-12T18:11:24Z

Merge activity

May 12, 2:11 PM EDT: A user started a stack merge that includes this pull request via Graphite.
May 12, 2:18 PM EDT: Graphite rebased this pull request as part of a merge.
May 12, 2:21 PM EDT: @arsenm merged this pull request with Graphite.

This was referenced May 12, 2025

AMDGPU: Add more tests for fmed3 instcombine folds #139529

Merged

AMDGPU: Use minnum instead of maxnum for fmed3 src2-nan fold #139531

Merged

arsenm added the backend:AMDGPU label May 12, 2025 — with Graphite App

arsenm requested review from jayfoad, Pierre-vh and rampitec May 12, 2025 10:28

arsenm marked this pull request as ready for review May 12, 2025 10:28

llvmbot added the llvm:instcombine and llvm:transforms labels May 12, 2025

arsenm mentioned this pull request May 12, 2025

AMDGPU: Use minimumnum/maximumnum for fmed3 with amdgpu-ieee=0 #139546

Merged

rampitec approved these changes May 12, 2025

arsenm force-pushed the users/arsenm/add-more-fmed3-folding-tests branch 2 times, most recently from 300db3f to a542a18 May 12, 2025 18:16

Base automatically changed from users/arsenm/add-more-fmed3-folding-tests to main May 12, 2025 18:18

AMDGPU: Disable most fmed3 folds for strictfp

50773b4

arsenm force-pushed the users/arsenm/amdgpu/strictfp-disable-fmed3-folds branch from 012d451 to 50773b4 May 12, 2025 18:18

arsenm merged commit 83107e0 into main May 12, 2025
6 of 9 checks passed

arsenm deleted the users/arsenm/amdgpu/strictfp-disable-fmed3-folds branch May 12, 2025 18:21
