Skip to content

AMDGPU: Disable most fmed3 folds for strictfp #139530

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 12, 2025

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented May 12, 2025

No description provided.

Copy link
Contributor Author

arsenm commented May 12, 2025

@llvmbot
Copy link
Member

llvmbot commented May 12, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/139530.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp (+3)
  • (modified) llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll (+6-6)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
index e76396f6ffbb0..1494428cb2bf5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
@@ -855,6 +855,9 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
         return IC.replaceInstUsesWith(II, Src);
     }
 
+    if (II.isStrictFP())
+      break;
+
     // Checking for NaN before canonicalization provides better fidelity when
     // mapping other operations onto fmed3 since the order of operands is
     // unchanged.
diff --git a/llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll b/llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll
index 5274ac1093a26..bf94637b36a34 100644
--- a/llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll
+++ b/llvm/test/Transforms/InstCombine/AMDGPU/fmed3.ll
@@ -494,7 +494,7 @@ define amdgpu_ps float @amdgpu_ps_attr_fmed3_x_y_snan1_f32(float %x, float %y) #
 define float @fmed3_qnan0_x_y_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_qnan0_x_y_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3:[0-9]+]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.minnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8000000000000, float [[X]], float [[Y]]) #[[ATTR5:[0-9]+]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8000000000000, float %x, float %y) strictfp
@@ -504,7 +504,7 @@ define float @fmed3_qnan0_x_y_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_x_qnan0_y_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_x_qnan0_y_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.minnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float [[X]], float 0x7FF8000000000000, float [[Y]]) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float 0x7FF8000000000000, float %y) strictfp
@@ -514,7 +514,7 @@ define float @fmed3_x_qnan0_y_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_x_y_qnan0_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_x_y_qnan0_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.maxnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float [[X]], float [[Y]], float 0x7FF8000000000000) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 0x7FF8000000000000) strictfp
@@ -524,7 +524,7 @@ define float @fmed3_x_y_qnan0_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_snan1_x_y_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_snan1_x_y_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.minnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float 0x7FF4000000000000, float [[X]], float [[Y]]) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF4000000000000, float %x, float %y) strictfp
@@ -534,7 +534,7 @@ define float @fmed3_snan1_x_y_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_x_snan1_y_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_x_snan1_y_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.minnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float [[X]], float 0x7FF4000000000000, float [[Y]]) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float 0x7FF4000000000000, float %y) strictfp
@@ -544,7 +544,7 @@ define float @fmed3_x_snan1_y_f32_strictfp(float %x, float %y) #2 {
 define float @fmed3_x_y_snan1_f32_strictfp(float %x, float %y) #2 {
 ; CHECK-LABEL: define float @fmed3_x_y_snan1_f32_strictfp(
 ; CHECK-SAME: float [[X:%.*]], float [[Y:%.*]]) #[[ATTR3]] {
-; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.maxnum.f32(float [[X]], float [[Y]])
+; CHECK-NEXT:    [[MED3:%.*]] = call float @llvm.amdgcn.fmed3.f32(float [[X]], float [[Y]], float 0x7FF4000000000000) #[[ATTR5]]
 ; CHECK-NEXT:    ret float [[MED3]]
 ;
   %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 0x7FF4000000000000) strictfp

@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels May 12, 2025
Copy link
Contributor Author

arsenm commented May 12, 2025

Merge activity

  • May 12, 2:11 PM EDT: A user started a stack merge that includes this pull request via Graphite.
  • May 12, 2:18 PM EDT: Graphite rebased this pull request as part of a merge.
  • May 12, 2:21 PM EDT: @arsenm merged this pull request with Graphite.

@arsenm arsenm force-pushed the users/arsenm/add-more-fmed3-folding-tests branch 2 times, most recently from 300db3f to a542a18 Compare May 12, 2025 18:16
Base automatically changed from users/arsenm/add-more-fmed3-folding-tests to main May 12, 2025 18:18
@arsenm arsenm force-pushed the users/arsenm/amdgpu/strictfp-disable-fmed3-folds branch from 012d451 to 50773b4 Compare May 12, 2025 18:18
@arsenm arsenm merged commit 83107e0 into main May 12, 2025
6 of 9 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/strictfp-disable-fmed3-folds branch May 12, 2025 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants