Skip to content

[InstSimplify] Remove most of undef handling logic #119884

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Dec 13, 2024

This patch removes most of the undef handling logic in InstSimplify to avoid introducing new undef values and improve the compile time.
However, some folding logic related with icmp and select are still kept to avoid performance regressions on Rust applications.
Compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=d1a7225076218ce224cd29c74259b715b393dc9d&to=8ee1f13c52bfdd468e9522dab50e386c157a8d15&stat=instructions%3Au

Copy link

github-actions bot commented Dec 13, 2024

✅ With the latest revision this PR passed the undef deprecator.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Dec 13, 2024

Missing fold: https://alive2.llvm.org/ce/z/8f6oZo

dtcxzyw added a commit that referenced this pull request Dec 16, 2024
…rms (#120011)

This patch is proposed to reduce the number of selects with undefs
introduced by #119884.
@dtcxzyw dtcxzyw marked this pull request as ready for review December 17, 2024 07:41
@dtcxzyw dtcxzyw requested a review from nikic as a code owner December 17, 2024 07:41
@llvmbot llvmbot added backend:AMDGPU llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Dec 17, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 17, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-analysis

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch removes most of the undef handling logic in InstSimplify. Some folding logic related with icmp and select are still kept to avoid performance regressions on Rust applications.
Compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=d1a7225076218ce224cd29c74259b715b393dc9d&to=f3a46a02f75fe6137501ded3dcea5abc922c5a36&stat=instructions%3Au


Patch is 38.88 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119884.diff

14 Files Affected:

  • (modified) llvm/lib/Analysis/InstructionSimplify.cpp (+8-22)
  • (modified) llvm/test/CodeGen/AMDGPU/nested-loop-conditions.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll (+26-24)
  • (modified) llvm/test/Transforms/InstCombine/is_fpclass.ll (-7)
  • (modified) llvm/test/Transforms/InstCombine/vec_demanded_elts.ll (+5-5)
  • (modified) llvm/test/Transforms/InstSimplify/2011-09-05-InsertExtractValue.ll (+1-1)
  • (modified) llvm/test/Transforms/InstSimplify/call.ll (+11-51)
  • (modified) llvm/test/Transforms/InstSimplify/extract-element.ll (+1-1)
  • (modified) llvm/test/Transforms/InstSimplify/gep.ll (+8-8)
  • (modified) llvm/test/Transforms/InstSimplify/icmp-constant.ll (+1-1)
  • (renamed) llvm/test/Transforms/InstSimplify/poison.ll (+66-66)
  • (modified) llvm/test/Transforms/InstSimplify/vscale-inseltpoison.ll (+3-3)
  • (modified) llvm/test/Transforms/InstSimplify/vscale.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopUnroll/wrong_assert_in_peeling.ll (+16-12)
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 3325cd972cf1eb..bf15090d973ea0 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -748,11 +748,6 @@ static Value *simplifySubInst(Value *Op0, Value *Op1, bool IsNSW, bool IsNUW,
   if (isa<PoisonValue>(Op0) || isa<PoisonValue>(Op1))
     return PoisonValue::get(Op0->getType());
 
-  // X - undef -> undef
-  // undef - X -> undef
-  if (Q.isUndefValue(Op0) || Q.isUndefValue(Op1))
-    return UndefValue::get(Op0->getType());
-
   // X - 0 -> X
   if (match(Op1, m_Zero()))
     return Op0;
@@ -5002,10 +4997,6 @@ static Value *simplifyGEPInst(Type *SrcTy, Value *Ptr,
       any_of(Indices, [](const auto *V) { return isa<PoisonValue>(V); }))
     return PoisonValue::get(GEPTy);
 
-  // getelementptr undef, idx -> undef
-  if (Q.isUndefValue(Ptr))
-    return UndefValue::get(GEPTy);
-
   bool IsScalableVec =
       SrcTy->isScalableTy() || any_of(Indices, [](const Value *V) {
         return isa<ScalableVectorType>(V->getType());
@@ -5223,13 +5214,13 @@ static Value *simplifyExtractElementInst(Value *Vec, Value *Idx,
     if (auto *CIdx = dyn_cast<Constant>(Idx))
       return ConstantExpr::getExtractElement(CVec, CIdx);
 
-    if (Q.isUndefValue(Vec))
-      return UndefValue::get(VecVTy->getElementType());
+    if (isa<PoisonValue>(Vec))
+      return PoisonValue::get(VecVTy->getElementType());
   }
 
-  // An undef extract index can be arbitrarily chosen to be an out-of-range
+  // A poison extract index can be arbitrarily chosen to be an out-of-range
   // index value, which would result in the instruction being poison.
-  if (Q.isUndefValue(Idx))
+  if (isa<PoisonValue>(Idx))
     return PoisonValue::get(VecVTy->getElementType());
 
   // If extracting a specified index from the vector, see if we can recursively
@@ -6673,8 +6664,6 @@ Value *llvm::simplifyBinaryIntrinsic(Intrinsic::ID IID, Type *ReturnType,
       return ConstantInt::get(ReturnType, true);
     if ((Mask & fcAllFlags) == 0)
       return ConstantInt::get(ReturnType, false);
-    if (Q.isUndefValue(Op0))
-      return UndefValue::get(ReturnType);
     break;
   }
   case Intrinsic::maxnum:
@@ -6799,13 +6788,10 @@ static Value *simplifyIntrinsic(CallBase *Call, Value *Callee,
   case Intrinsic::fshr: {
     Value *Op0 = Args[0], *Op1 = Args[1], *ShAmtArg = Args[2];
 
-    // If both operands are undef, the result is undef.
-    if (Q.isUndefValue(Op0) && Q.isUndefValue(Op1))
-      return UndefValue::get(F->getReturnType());
-
-    // If shift amount is undef, assume it is zero.
-    if (Q.isUndefValue(ShAmtArg))
-      return Args[IID == Intrinsic::fshl ? 0 : 1];
+    // If any of the operands is poison, the result is poison.
+    if (isa<PoisonValue>(Op0) || isa<PoisonValue>(Op1) ||
+        isa<PoisonValue>(ShAmtArg))
+      return PoisonValue::get(F->getReturnType());
 
     const APInt *ShAmtC;
     if (match(ShAmtArg, m_APInt(ShAmtC))) {
diff --git a/llvm/test/CodeGen/AMDGPU/nested-loop-conditions.ll b/llvm/test/CodeGen/AMDGPU/nested-loop-conditions.ll
index bd6ef9e088b12f..fc18580276ebc6 100644
--- a/llvm/test/CodeGen/AMDGPU/nested-loop-conditions.ll
+++ b/llvm/test/CodeGen/AMDGPU/nested-loop-conditions.ll
@@ -154,7 +154,7 @@ define amdgpu_kernel void @nested_loop_conditions(ptr addrspace(1) nocapture %ar
 ; GCN-NEXT:    v_cmp_lt_i32_e32 vcc, 8, v0
 ; GCN-NEXT:    s_cbranch_vccnz .LBB1_6
 ; GCN-NEXT:  ; %bb.1: ; %bb14.lr.ph
-; GCN-NEXT:    s_load_dword s4, s[0:1], 0x0
+; GCN-NEXT:    s_load_dwordx4 s[4:7], s[0:1], 0x0
 ; GCN-NEXT:    s_branch .LBB1_3
 ; GCN-NEXT:  .LBB1_2: ; %Flow
 ; GCN-NEXT:    ; in Loop: Header=BB1_3 Depth=1
@@ -179,7 +179,7 @@ define amdgpu_kernel void @nested_loop_conditions(ptr addrspace(1) nocapture %ar
 ; GCN-NEXT:    s_cbranch_vccnz .LBB1_4
 ; GCN-NEXT:  ; %bb.5: ; %bb21
 ; GCN-NEXT:    ; in Loop: Header=BB1_3 Depth=1
-; GCN-NEXT:    s_load_dword s4, s[0:1], 0x0
+; GCN-NEXT:    s_load_dwordx4 s[4:7], s[0:1], 0x0
 ; GCN-NEXT:    buffer_load_dword v0, off, s[0:3], 0 glc
 ; GCN-NEXT:    s_waitcnt vmcnt(0)
 ; GCN-NEXT:    v_cmp_lt_i32_e64 s[0:1], 8, v0
diff --git a/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll b/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
index 5fdb918c875459..680fee2693202b 100644
--- a/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
+++ b/llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
@@ -66,7 +66,7 @@ define double @test_constant_fold_rcp_f64_43() nounwind {
 
 define float @test_constant_fold_rcp_f32_43_strictfp() nounwind strictfp {
 ; CHECK-LABEL: @test_constant_fold_rcp_f32_43_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.rcp.f32(float 4.300000e+01) #[[ATTR14:[0-9]+]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.rcp.f32(float 4.300000e+01) #[[ATTR15:[0-9]+]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.rcp.f32(float 4.300000e+01) strictfp nounwind readnone
@@ -115,7 +115,7 @@ define half @test_constant_fold_sqrt_f16_0() nounwind {
 
 define float @test_constant_fold_sqrt_f32_0() nounwind {
 ; CHECK-LABEL: @test_constant_fold_sqrt_f32_0(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.sqrt.f32(float 0.000000e+00) #[[ATTR15:[0-9]+]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.sqrt.f32(float 0.000000e+00) #[[ATTR16:[0-9]+]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.sqrt.f32(float 0.0) nounwind readnone
@@ -124,7 +124,7 @@ define float @test_constant_fold_sqrt_f32_0() nounwind {
 
 define double @test_constant_fold_sqrt_f64_0() nounwind {
 ; CHECK-LABEL: @test_constant_fold_sqrt_f64_0(
-; CHECK-NEXT:    [[VAL:%.*]] = call double @llvm.amdgcn.sqrt.f64(double 0.000000e+00) #[[ATTR15]]
+; CHECK-NEXT:    [[VAL:%.*]] = call double @llvm.amdgcn.sqrt.f64(double 0.000000e+00) #[[ATTR16]]
 ; CHECK-NEXT:    ret double [[VAL]]
 ;
   %val = call double @llvm.amdgcn.sqrt.f64(double 0.0) nounwind readnone
@@ -141,7 +141,7 @@ define half @test_constant_fold_sqrt_f16_neg0() nounwind {
 
 define float @test_constant_fold_sqrt_f32_neg0() nounwind {
 ; CHECK-LABEL: @test_constant_fold_sqrt_f32_neg0(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.sqrt.f32(float -0.000000e+00) #[[ATTR15]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.sqrt.f32(float -0.000000e+00) #[[ATTR16]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.sqrt.f32(float -0.0) nounwind readnone
@@ -150,7 +150,7 @@ define float @test_constant_fold_sqrt_f32_neg0() nounwind {
 
 define double @test_constant_fold_sqrt_f64_neg0() nounwind {
 ; CHECK-LABEL: @test_constant_fold_sqrt_f64_neg0(
-; CHECK-NEXT:    [[VAL:%.*]] = call double @llvm.amdgcn.sqrt.f64(double -0.000000e+00) #[[ATTR15]]
+; CHECK-NEXT:    [[VAL:%.*]] = call double @llvm.amdgcn.sqrt.f64(double -0.000000e+00) #[[ATTR16]]
 ; CHECK-NEXT:    ret double [[VAL]]
 ;
   %val = call double @llvm.amdgcn.sqrt.f64(double -0.0) nounwind readnone
@@ -667,7 +667,7 @@ define i1 @test_class_undef_full_mask_f32() nounwind {
 
 define i1 @test_class_undef_val_f32() nounwind {
 ; CHECK-LABEL: @test_class_undef_val_f32(
-; CHECK-NEXT:    ret i1 undef
+; CHECK-NEXT:    ret i1 false
 ;
   %val = call i1 @llvm.amdgcn.class.f32(float undef, i32 4)
   ret i1 %val
@@ -718,7 +718,7 @@ define i1 @test_class_isnan_f32(float %x) nounwind {
 
 define i1 @test_class_isnan_f32_strict(float %x) nounwind strictfp {
 ; CHECK-LABEL: @test_class_isnan_f32_strict(
-; CHECK-NEXT:    [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 3) #[[ATTR16:[0-9]+]]
+; CHECK-NEXT:    [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 3) #[[ATTR17:[0-9]+]]
 ; CHECK-NEXT:    ret i1 [[VAL]]
 ;
   %val = call i1 @llvm.amdgcn.class.f32(float %x, i32 3) strictfp
@@ -736,7 +736,7 @@ define i1 @test_class_is_p0_n0_f32(float %x) nounwind {
 
 define i1 @test_class_is_p0_n0_f32_strict(float %x) nounwind strictfp {
 ; CHECK-LABEL: @test_class_is_p0_n0_f32_strict(
-; CHECK-NEXT:    [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 96) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 96) #[[ATTR17]]
 ; CHECK-NEXT:    ret i1 [[VAL]]
 ;
   %val = call i1 @llvm.amdgcn.class.f32(float %x, i32 96) strictfp
@@ -2000,7 +2000,7 @@ define i64 @icmp_constant_inputs_false() {
 
 define i64 @icmp_constant_inputs_true() {
 ; CHECK-LABEL: @icmp_constant_inputs_true(
-; CHECK-NEXT:    [[RESULT:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0:![0-9]+]]) #[[ATTR17:[0-9]+]]
+; CHECK-NEXT:    [[RESULT:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0:![0-9]+]]) #[[ATTR18:[0-9]+]]
 ; CHECK-NEXT:    ret i64 [[RESULT]]
 ;
   %result = call i64 @llvm.amdgcn.icmp.i64.i32(i32 9, i32 8, i32 34)
@@ -2707,7 +2707,7 @@ define i64 @fcmp_constant_inputs_false() {
 
 define i64 @fcmp_constant_inputs_true() {
 ; CHECK-LABEL: @fcmp_constant_inputs_true(
-; CHECK-NEXT:    [[RESULT:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0]]) #[[ATTR17]]
+; CHECK-NEXT:    [[RESULT:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0]]) #[[ATTR18]]
 ; CHECK-NEXT:    ret i64 [[RESULT]]
 ;
   %result = call i64 @llvm.amdgcn.fcmp.i64.f32(float 2.0, float 4.0, i32 4)
@@ -5829,7 +5829,7 @@ define double @trig_preop_constfold_neg32_segment() {
 
 define double @trig_preop_constfold_strictfp() strictfp {
 ; CHECK-LABEL: @trig_preop_constfold_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5) #[[ATTR17]]
 ; CHECK-NEXT:    ret double [[VAL]]
 ;
   %val = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5) strictfp
@@ -6198,7 +6198,7 @@ define half @test_constant_fold_log_f16_neg10() {
 
 define float @test_constant_fold_log_f32_qnan_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_log_f32_qnan_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float 0x7FF8000000000000) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float 0x7FF8000000000000) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.log.f32(float 0x7FF8000000000000) strictfp
@@ -6207,7 +6207,7 @@ define float @test_constant_fold_log_f32_qnan_strictfp() strictfp {
 
 define float @test_constant_fold_log_f32_0_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_log_f32_0_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float 0.000000e+00) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float 0.000000e+00) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.log.f32(float 0.0) strictfp
@@ -6216,7 +6216,7 @@ define float @test_constant_fold_log_f32_0_strictfp() strictfp {
 
 define float @test_constant_fold_log_f32_neg0_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_log_f32_neg0_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float -0.000000e+00) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float -0.000000e+00) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.log.f32(float -0.0) strictfp
@@ -6225,7 +6225,7 @@ define float @test_constant_fold_log_f32_neg0_strictfp() strictfp {
 
 define float @test_constant_fold_log_f32_neg_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_log_f32_neg_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float -1.000000e+01) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float -1.000000e+01) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.log.f32(float -10.0) strictfp
@@ -6242,7 +6242,7 @@ define float @test_constant_fold_log_f32_pinf_strictfp() strictfp {
 
 define float @test_constant_fold_log_f32_ninf_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_log_f32_ninf_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float 0xFFF0000000000000) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.log.f32(float 0xFFF0000000000000) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.log.f32(float 0xFFF0000000000000) strictfp
@@ -6444,7 +6444,7 @@ define half @test_constant_fold_exp2_f16_neg10() {
 
 define float @test_constant_fold_exp2_f32_qnan_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_exp2_f32_qnan_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float 0x7FF8000000000000) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float 0x7FF8000000000000) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.exp2.f32(float 0x7FF8000000000000) strictfp
@@ -6453,7 +6453,7 @@ define float @test_constant_fold_exp2_f32_qnan_strictfp() strictfp {
 
 define float @test_constant_fold_exp2_f32_0_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_exp2_f32_0_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float 0.000000e+00) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float 0.000000e+00) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.exp2.f32(float 0.0) strictfp
@@ -6462,7 +6462,7 @@ define float @test_constant_fold_exp2_f32_0_strictfp() strictfp {
 
 define float @test_constant_fold_exp2_f32_neg0_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_exp2_f32_neg0_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float -0.000000e+00) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float -0.000000e+00) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.exp2.f32(float -0.0) strictfp
@@ -6471,7 +6471,7 @@ define float @test_constant_fold_exp2_f32_neg0_strictfp() strictfp {
 
 define float @test_constant_fold_exp2_f32_1_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_exp2_f32_1_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float 1.000000e+00) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float 1.000000e+00) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.exp2.f32(float 1.0) strictfp
@@ -6480,7 +6480,7 @@ define float @test_constant_fold_exp2_f32_1_strictfp() strictfp {
 
 define float @test_constant_fold_exp2_f32_neg1_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_exp2_f32_neg1_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float -1.000000e+00) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float -1.000000e+00) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.exp2.f32(float -1.0) strictfp
@@ -6489,7 +6489,7 @@ define float @test_constant_fold_exp2_f32_neg1_strictfp() strictfp {
 
 define float @test_constant_fold_exp2_f32_2_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_exp2_f32_2_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float 2.000000e+00) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float 2.000000e+00) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.exp2.f32(float 2.0) strictfp
@@ -6498,7 +6498,7 @@ define float @test_constant_fold_exp2_f32_2_strictfp() strictfp {
 
 define float @test_constant_fold_exp2_f32_neg2_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_exp2_f32_neg2_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float -2.000000e+00) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float -2.000000e+00) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.exp2.f32(float -2.0) strictfp
@@ -6507,7 +6507,7 @@ define float @test_constant_fold_exp2_f32_neg2_strictfp() strictfp {
 
 define float @test_constant_fold_exp2_f32_neg_strictfp() strictfp {
 ; CHECK-LABEL: @test_constant_fold_exp2_f32_neg_strictfp(
-; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float -1.000000e+01) #[[ATTR16]]
+; CHECK-NEXT:    [[VAL:%.*]] = call float @llvm.amdgcn.exp2.f32(float -1.000000e+01) #[[ATTR17]]
 ; CHECK-NEXT:    ret float [[VAL]]
 ;
   %val = call float @llvm.amdgcn.exp2.f32(float -10.0) strictfp
@@ -6555,6 +6555,7 @@ declare i32 @llvm.amdgcn.prng.b32(i32)
 define i32 @prng_undef_i32() {
 ; CHECK-LABEL: @prng_undef_i32(
 ; CHECK-NEXT:    ret i32 undef
+;
   %prng = call i32 @llvm.amdgcn.prng.b32(i32 undef)
   ret i32 %prng
 }
@@ -6562,6 +6563,7 @@ define i32 @prng_undef_i32() {
 define i32 @prng_poison_i32() {
 ; CHECK-LABEL: @prng_poison_i32(
 ; CHECK-NEXT:    ret i32 poison
+;
   %prng = call i32 @llvm.amdgcn.prng.b32(i32 poison)
   ret i32 %prng
 }
diff --git a/llvm/test/Transforms/InstCombine/is_fpclass.ll b/llvm/test/Transforms/InstCombine/is_fpclass.ll
index c1809b8bec61cb..c39587ce8a2abb 100644
--- a/llvm/test/Transforms/InstCombine/is_fpclass.ll
+++ b/llvm/test/Transforms/InstCombine/is_fpclass.ll
@@ -59,13 +59,6 @@ define i1 @test_class_poison_full_mask_f32() {
   ret i1 %val
 }
 
-define i1 @test_class_undef_val_f32() {
-; CHECK-LABEL: @test_class_undef_val_f32(
-; CHECK-NEXT:    ret i1 undef
-;
-  %val = call i1 @llvm.is.fpclass.f32(float undef, i32 4)
-  ret i1 %val
-}
 
 define i1 @test_class_poison_val_f32() {
 ; CHECK-LABEL: @test_class_poison_val_f32(
diff --git a/llvm/test/Transforms/InstCombine/vec_demanded_elts.ll b/llvm/test/Transforms/InstCombine/vec_demanded_elts.ll
index ee7ef9955e643f..e6c4034cecd932 100644
--- a/llvm/test/Transforms/InstCombine/vec_demanded_elts.ll
+++ b/llvm/test/Transforms/InstCombine/vec_demanded_elts.ll
@@ -628,12 +628,12 @@ define <2 x ptr> @gep_all_lanes_undef(ptr %base, i64 %idx) {;
   ret <2 x ptr> %gep
 }
 
-define ptr @gep_demanded_lane_undef(ptr %base, i64 %idx) {
-; CHECK-LABEL: @gep_demanded_lane_undef(
-; CHECK-NEXT:    ret ptr undef
+define ptr @gep_demanded_lane_poison(ptr %base, i64 %idx) {
+; CHECK-LABEL: @gep_demanded_lane_poison(
+; CHECK-NEXT:    ret ptr poison
 ;
-  %basevec = insertelement <2 x ptr> undef, ptr %base, i32 0
-  %idxvec = insertelement <2 x i64> undef, i64 %idx, i32 1
+  %basevec = insertelement <2 x ptr> poison, ptr %base, i32 0
+  %idxvec = insertelement <2 x i64> poison, i64 %idx, i32 1
   %gep = getelementptr i32, <2 x ptr> %basevec, <2 x i64> %idxvec
   %ee = extractelement <2 x ptr> %gep, i32 1
   ret ptr %ee
diff --git a/llvm/test/Transforms/InstSimplify/2011-09-05-InsertExtractValue.ll b/llvm/test/Transforms/InstSimplify/2011-09-05-InsertExtractValue.ll
index 9c364d16badee7..bf6a91c2707b5c 100644
--- a/llvm/test/Transforms/InstSimplify/2011-09-05-InsertExtractValue.ll
+++ b/llvm/test/Transforms/InstSimplify/2011-09-05-InsertExtractValue.ll
@@ -66,6 +66,6 @@ define i32 @test5(<4 x i32> %V) {
 ; CHECK-LABEL: @test5(
 ; CHECK-NEXT:    ret i32 poison
 ;
-  %extract = extractelement <4 x i32> %V, i32 undef
+  %extract = extractelement <4 x i32> %V, i32 poison
   ret i32 %extract
 }
diff --git a/llvm/test/Transforms/InstSimplify/call.ll b/llvm/test/Transforms/InstSimplify/call.ll
index 67d5c4dbfb2e7d..97e7769a40ba78 100644
--- a/llvm/test/Transforms/InstSimplify/call.ll
+++ b/llvm/test/Transforms/InstSimplify/call.ll
@@ -826,48 +826,11 @@ define <2 x i8> @rotr_zero_shift_guard_splat(<2 x i8> %x, <2 x i8> %sh) {
   ret <2 x i8> %s
 }
 
-; If first two operands of funnel shift are undef, the result is undef
-
-define i8 @fshl_ops_undef(i8 %shamt) {
-; CHECK-LABEL: @fshl_ops_undef(
-; CHECK-NEXT:    ret i8 undef
-;
-  %r = call i8 @llvm.fshl.i8(i8 undef, i8 undef, i8 %shamt)
-  ret i8 %r
-}
-
-define i9 @fshr_ops_undef(i9 %shamt) {
-; CHECK-LABEL: @fshr_ops_undef(
-; CHECK-NEXT:    ret i9 undef
-;
-  %r = call i9 @llvm.fshr.i9(i9 undef, i9 undef, i9 %shamt)
-  ret i9 %r
-}
-
-; If shift amount is undef, treat it as zero, returning operand 0 or 1
-
-define i8 @fshl_shift_undef(i8 %x, i8 %y) {
-; CHECK-LABEL: @fshl_shift_undef(
-; CHECK-NEXT:    ret i8 [[X:%.*]]
-;
-  %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 undef)
-  ret i8 %r
-}
-
-define i9 @fshr_shift_undef(i9 %x, i9 %y) {
-; CHECK-LABEL: @fshr_shift_undef(
-; CHE...
[truncated]

@dtcxzyw dtcxzyw requested a review from arsenm December 17, 2024 08:19
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with doing this, but can you please outline the motivation for the change in the PR description?

@@ -826,81 +826,43 @@ define <2 x i8> @rotr_zero_shift_guard_splat(<2 x i8> %x, <2 x i8> %sh) {
ret <2 x i8> %s
}

; If first two operands of funnel shift are undef, the result is undef
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we still keep the test coverage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poison value handling has been tested by the following tests fshl_ops_poisonXXX. Should we keep these undef tests even if we finally remove undef?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to keep the undef test coverage (just showing that no fold happens, which still ensures that no incorrect fold happens).

@nunoplopes
Copy link
Member

This looks great, thank you!
As we move away from undef, most of the remaining uses are related with bitfields, so arithmetic on those isn't important (only bitwise operations).

That said, I wouldn't remove every single undef transformation because freeze poison with a single use can be optimized like undef in most cases. Right now, instcombine eagerly transformes those into 0, but that's suboptimal. We need to teach instcombine how to optimize freeze poison to eventually remove that freeze poison -> 0 transformation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU llvm:analysis Includes value tracking, cost tables and constant folding llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants