AMDGPU: Make sqrt and rsq intrinsics propagate poison #130914
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-amdgpu
Author: Matt Arsenault (arsenm)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/130914.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
index 6f6556365ebf6..5314738b2b8ac 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
@@ -548,6 +548,8 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
case Intrinsic::amdgcn_sqrt:
case Intrinsic::amdgcn_rsq: {
Value *Src = II.getArgOperand(0);
+ if (isa<PoisonValue>(Src))
+ return IC.replaceInstUsesWith(II, Src);
// TODO: Move to ConstantFolding/InstSimplify?
if (isa<UndefValue>(Src)) {
diff --git a/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll b/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
index 42ddc71dab848..fca3860240294 100644
--- a/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
+++ b/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
@@ -89,6 +89,14 @@ declare half @llvm.amdgcn.sqrt.f16(half) nounwind readnone
declare float @llvm.amdgcn.sqrt.f32(float) nounwind readnone
declare double @llvm.amdgcn.sqrt.f64(double) nounwind readnone
+define half @test_constant_fold_sqrt_f16_poison() nounwind {
+; CHECK-LABEL: @test_constant_fold_sqrt_f16_poison(
+; CHECK-NEXT: ret half poison
+;
+ %val = call half @llvm.amdgcn.sqrt.f16(half poison) nounwind readnone
+ ret half %val
+}
+
define half @test_constant_fold_sqrt_f16_undef() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_f16_undef(
; CHECK-NEXT: ret half 0xH7E00
@@ -97,6 +105,14 @@ define half @test_constant_fold_sqrt_f16_undef() nounwind {
ret half %val
}
+define float @test_constant_fold_sqrt_f32_poison() nounwind {
+; CHECK-LABEL: @test_constant_fold_sqrt_f32_poison(
+; CHECK-NEXT: ret float poison
+;
+ %val = call float @llvm.amdgcn.sqrt.f32(float poison) nounwind readnone
+ ret float %val
+}
+
define float @test_constant_fold_sqrt_f32_undef() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_f32_undef(
; CHECK-NEXT: ret float 0x7FF8000000000000
@@ -234,6 +250,14 @@ define double @test_amdgcn_sqrt_f64(double %arg) {
declare float @llvm.amdgcn.rsq.f32(float) nounwind readnone
+define float @test_constant_fold_rsq_f32_poison() nounwind {
+; CHECK-LABEL: @test_constant_fold_rsq_f32_poison(
+; CHECK-NEXT: ret float poison
+;
+ %val = call float @llvm.amdgcn.rsq.f32(float poison) nounwind readnone
+ ret float %val
+}
+
define float @test_constant_fold_rsq_f32_undef() nounwind {
; CHECK-LABEL: @test_constant_fold_rsq_f32_undef(
; CHECK-NEXT: ret float 0x7FF8000000000000
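The order of the two checks in the patch matters: in LLVM, `PoisonValue` derives from `UndefValue`, so `isa<UndefValue>` also matches poison, and without the earlier check a poison operand would fall into the undef branch and fold to a quiet NaN. The following is a minimal standalone sketch of that ordering. It is a toy model, not LLVM code: `Kind`, `Fold`, `isPoison`, `isUndefOrPoison`, and `foldSqrtOrRsq` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical classification of the intrinsic's operand (not an LLVM type).
enum class Kind { Poison, Undef, Other };

// Hypothetical outcome of the fold in this toy model.
enum class Fold { ToPoison, ToQuietNaN, None };

// Stand-in for isa<PoisonValue>(Src).
bool isPoison(Kind k) { return k == Kind::Poison; }

// Stand-in for isa<UndefValue>(Src): it deliberately matches poison too,
// mirroring the fact that PoisonValue is a subclass of UndefValue.
bool isUndefOrPoison(Kind k) {
  return k == Kind::Undef || k == Kind::Poison;
}

// Same order of checks as the patch: poison first, then undef.
Fold foldSqrtOrRsq(Kind src) {
  if (isPoison(src))
    return Fold::ToPoison;   // poison in, poison out
  if (isUndefOrPoison(src))
    return Fold::ToQuietNaN; // undef folds to a quiet NaN
  return Fold::None;         // other folds are not modeled here
}
```

In this model, `foldSqrtOrRsq(Kind::Poison)` yields `Fold::ToPoison` and `foldSqrtOrRsq(Kind::Undef)` yields `Fold::ToQuietNaN`, which corresponds to the `ret float poison` and `ret float 0x7FF8000000000000` CHECK lines in the tests.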
@@ -548,6 +548,8 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
case Intrinsic::amdgcn_sqrt:
case Intrinsic::amdgcn_rsq: {
Value *Src = II.getArgOperand(0);
+ if (isa<PoisonValue>(Src))
+ return IC.replaceInstUsesWith(II, Src);
Why does undef give QNaN while poison gives poison?
We've done this for FP ops for a while. I think the reasoning is that if the original value could have been a signaling NaN or a denormal, it would have gone through canonicalization. By returning a quiet NaN we still guarantee a canonical value (technically we don't guarantee this for generic math ops, but I guess we can maintain it for target intrinsics).
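To make the quiet versus signaling distinction concrete, here is a small sketch that classifies IEEE-754 binary32 bit patterns. It is illustrative only: the helper names are hypothetical, it works on raw integer bits rather than on LLVM constants, and `quietNaNBits` models only the NaN-quieting part of canonicalization (denormal flushing depends on the FP mode and is not modeled).

```cpp
#include <cassert>
#include <cstdint>

// IEEE-754 binary32 layout: 1 sign bit, 8 exponent bits, 23 mantissa bits.
constexpr std::uint32_t kExpMask  = 0x7F800000u;
constexpr std::uint32_t kMantMask = 0x007FFFFFu;
constexpr std::uint32_t kQuietBit = 0x00400000u; // top mantissa bit

// NaN: exponent all ones and a nonzero mantissa.
constexpr bool isNaNBits(std::uint32_t b) {
  return (b & kExpMask) == kExpMask && (b & kMantMask) != 0;
}

// Quiet NaN: a NaN with the top mantissa bit set.
constexpr bool isQuietNaNBits(std::uint32_t b) {
  return isNaNBits(b) && (b & kQuietBit) != 0;
}

// Signaling NaN: a NaN with the top mantissa bit clear.
constexpr bool isSignalingNaNBits(std::uint32_t b) {
  return isNaNBits(b) && (b & kQuietBit) == 0;
}

// Denormal: exponent all zeros and a nonzero mantissa.
constexpr bool isDenormalBits(std::uint32_t b) {
  return (b & kExpMask) == 0 && (b & kMantMask) != 0;
}

// Toy quieting step: set the quiet bit on NaNs, leave other values alone.
constexpr std::uint32_t quietNaNBits(std::uint32_t b) {
  return isNaNBits(b) ? (b | kQuietBit) : b;
}

static_assert(isQuietNaNBits(0x7FC00000u), "default quiet NaN pattern");
```

Here `0x7FC00000` is the usual binary32 quiet NaN pattern. The `0x7FF8000000000000` in the f32 CHECK lines is the same kind of value written in LLVM IR's double-width hexadecimal form for float constants, and `0xH7E00` is the half-precision counterpart.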