Skip to content

AMDGPU: Make sqrt and rsq intrinsics propagate poison #130914

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Mar 12, 2025

No description provided.

Copy link
Contributor Author

arsenm commented Mar 12, 2025

@llvmbot
Copy link
Member

llvmbot commented Mar 12, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/130914.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp (+2)
  • (modified) llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll (+24)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
index 6f6556365ebf6..5314738b2b8ac 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
@@ -548,6 +548,8 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
   case Intrinsic::amdgcn_sqrt:
   case Intrinsic::amdgcn_rsq: {
     Value *Src = II.getArgOperand(0);
+    if (isa<PoisonValue>(Src))
+      return IC.replaceInstUsesWith(II, Src);
 
     // TODO: Move to ConstantFolding/InstSimplify?
     if (isa<UndefValue>(Src)) {
diff --git a/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll b/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
index 42ddc71dab848..fca3860240294 100644
--- a/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
+++ b/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
@@ -89,6 +89,14 @@ declare half @llvm.amdgcn.sqrt.f16(half) nounwind readnone
 declare float @llvm.amdgcn.sqrt.f32(float) nounwind readnone
 declare double @llvm.amdgcn.sqrt.f64(double) nounwind readnone
 
+define half @test_constant_fold_sqrt_f16_poison() nounwind {
+; CHECK-LABEL: @test_constant_fold_sqrt_f16_poison(
+; CHECK-NEXT:    ret half poison
+;
+  %val = call half @llvm.amdgcn.sqrt.f16(half poison) nounwind readnone
+  ret half %val
+}
+
 define half @test_constant_fold_sqrt_f16_undef() nounwind {
 ; CHECK-LABEL: @test_constant_fold_sqrt_f16_undef(
 ; CHECK-NEXT:    ret half 0xH7E00
@@ -97,6 +105,14 @@ define half @test_constant_fold_sqrt_f16_undef() nounwind {
   ret half %val
 }
 
+define float @test_constant_fold_sqrt_f32_poison() nounwind {
+; CHECK-LABEL: @test_constant_fold_sqrt_f32_poison(
+; CHECK-NEXT:    ret float poison
+;
+  %val = call float @llvm.amdgcn.sqrt.f32(float poison) nounwind readnone
+  ret float %val
+}
+
 define float @test_constant_fold_sqrt_f32_undef() nounwind {
 ; CHECK-LABEL: @test_constant_fold_sqrt_f32_undef(
 ; CHECK-NEXT:    ret float 0x7FF8000000000000
@@ -234,6 +250,14 @@ define double @test_amdgcn_sqrt_f64(double %arg) {
 
 declare float @llvm.amdgcn.rsq.f32(float) nounwind readnone
 
+define float @test_constant_fold_rsq_f32_poison() nounwind {
+; CHECK-LABEL: @test_constant_fold_rsq_f32_poison(
+; CHECK-NEXT:    ret float poison
+;
+  %val = call float @llvm.amdgcn.rsq.f32(float poison) nounwind readnone
+  ret float %val
+}
+
 define float @test_constant_fold_rsq_f32_undef() nounwind {
 ; CHECK-LABEL: @test_constant_fold_rsq_f32_undef(
 ; CHECK-NEXT:    ret float 0x7FF8000000000000

@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Mar 12, 2025
@@ -548,6 +548,8 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
case Intrinsic::amdgcn_sqrt:
case Intrinsic::amdgcn_rsq: {
Value *Src = II.getArgOperand(0);
if (isa<PoisonValue>(Src))
return IC.replaceInstUsesWith(II, Src);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does undef give QNaN while poison give poison?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've done this for a while for FP ops. I think the reasoning is that if the original value could have been a signaling nan or denormal, that would go through canonicalization. We're still guaranteeing a canonical value by returning a qnan (although technically we don't guarantee this for generic math ops, but I guess we can maintain it for target intrinsics)

Copy link
Contributor Author

arsenm commented Mar 13, 2025

Merge activity

  • Mar 12, 10:54 PM EDT: A user started a stack merge that includes this pull request via Graphite.
  • Mar 12, 10:59 PM EDT: Graphite rebased this pull request as part of a merge.
  • Mar 12, 11:01 PM EDT: A user merged this pull request with Graphite.

@arsenm arsenm force-pushed the users/arsenm/amdgpu/make-rcp-intrinsic-propagate-poison branch from 8fd7cd3 to 2b21ffa Compare March 13, 2025 02:55
Base automatically changed from users/arsenm/amdgpu/make-rcp-intrinsic-propagate-poison to main March 13, 2025 02:58
@arsenm arsenm force-pushed the users/arsenm/amdgpu/make-sqrt-rsq-intrinsics-propagate-poison branch from 9babf2e to 97c02c4 Compare March 13, 2025 02:59
@arsenm arsenm merged commit d8f17b3 into main Mar 13, 2025
5 of 9 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/make-sqrt-rsq-intrinsics-propagate-poison branch March 13, 2025 03:01
frederik-h pushed a commit to frederik-h/llvm-project that referenced this pull request Mar 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants