Skip to content

[RISCV][TTI] Add cost of typebased cast VPIntrinsics with functionalOPC. #97797

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Sep 5, 2024

Conversation

ElvisWang123
Copy link
Contributor

This patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart.
This is the following patch of #93435

@llvmbot llvmbot added backend:RISC-V llvm:analysis Includes value tracking, cost tables and constant folding labels Jul 5, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 5, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Elvis Wang (ElvisWang123)

Changes

This patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart.
This is the following patch of #93435


Patch is 67.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97797.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+17)
  • (modified) llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll (+603)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index d603138773de4..63a554f4740dc 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -985,6 +985,23 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
       return getArithmeticInstrCost(*FOp, ICA.getReturnType(), CostKind);
     break;
   }
+  // vp int cast ops.
+  case Intrinsic::vp_trunc:
+  case Intrinsic::vp_zext:
+  case Intrinsic::vp_sext:
+  // vp float cast ops.
+  case Intrinsic::vp_fptoui:
+  case Intrinsic::vp_fptosi:
+  case Intrinsic::vp_uitofp:
+  case Intrinsic::vp_sitofp:
+  case Intrinsic::vp_fptrunc:
+  case Intrinsic::vp_fpext: {
+    std::optional<unsigned> FOp =
+        VPIntrinsic::getFunctionalOpcodeForVP(ICA.getID());
+    if (FOp && !ICA.getArgTypes().empty())
+      return getCastInstrCost(*FOp, ICA.getReturnType(), ICA.getArgTypes()[0], TTI::CastContextHint::None, CostKind);
+    break;
+  }
   }
 
   if (ST->hasVInstructions() && RetTy->isVectorTy()) {
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
index 87ffb23dcb88e..2dc1efc7b5265 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
@@ -19,6 +19,609 @@ define void @unsupported_fp_ops(<vscale x 4 x float> %vec, i32 %extraarg) {
   ret void
 }
 
+define void @int_truncate() {
+; CHECK-LABEL: 'int_truncate'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_truncate'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  trunc i32 undef to i8
+  trunc <1 x i32> undef to <1 x i8>
+  trunc <2 x i32> undef to <2 x i8>
+  trunc <4 x i32> undef to <4 x i8>
+  trunc <8 x i32> undef to <8 x i8>
+  trunc <16 x i32> undef to <16 x i8>
+  trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+  trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+  trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+  trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+  call <1 x i8> @llvm.vp.trunc.v1i32.v1i8(<1 x i32> undef, <1 x i1> undef, i32 undef)
+  call <2 x i8> @llvm.vp.trunc.v2i32.v2i8(<2 x i32> undef, <2 x i1> undef, i32 undef)
+  call <4 x i8> @llvm.vp.trunc.v4i32.v4i8(<4 x i32> undef, <4 x i1> undef, i32 undef)
+  call <8 x i8> @llvm.vp.trunc.v8i32.v8i8(<8 x i32> undef, <8 x i1> undef, i32 undef)
+  call <16 x i8> @llvm.vp.trunc.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+  call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i32.nxv1i8(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+  call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i32.nxv2i8(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+  call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i32.nxv4i8(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+  call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i32.nxv8i8(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+  ret void
+}
+
+define void @int_zext() {
+; CHECK-LABEL: 'int_zext'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_zext'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  zext i32 undef to i64
+  zext <1 x i32> undef to <1 x i64>
+  zext <2 x i32> undef to <2 x i64>
+  zext <4 x i32> undef to <4 x i64>
+  zext <8 x i32> undef to <8 x i64>
+  zext <16 x i32> undef to <16 x i64>
+  zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+  zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+  zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+  zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+  call <1 x i64> @llvm.vp.zext.v1i32.v1i64(<1 x i32> undef, <1 x i1> undef, i32 undef)
+  call <2 x i64> @llvm.vp.zext.v2i32.v2i64(<2 x i32> undef, <2 x i1> undef, i32 undef)
+  call <4 x i64> @llvm.vp.zext.v4i32.v4i64(<4 x i32> undef, <4 x i1> undef, i32 undef)
+  call <8 x i64> @llvm.vp.zext.v8i32.v8i64(<8 x i32> undef, <8 x i1> undef, i32 undef)
+  call <16 x i64> @llvm.vp.zext.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+  call <vscale x 1 x i64> @llvm.vp.zext.nxv1i32.nxv1i64(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+  call <vscale x 2 x i64> @llvm.vp.zext.nxv2i32.nxv2i64(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+  call <vscale x 4 x i64> @llvm.vp.zext.nxv4i32.nxv4i64(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+  call <vscale x 8 x i64> @llvm.vp.zext.nxv8i32.nxv8i64(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+  ret void
+}
+
+define void @int_sext() {
+; CHECK-LABEL: 'int_sext'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.sext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.sext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.sext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.sext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.sext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.sext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.sext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.sext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_sext'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x ...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Jul 5, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Elvis Wang (ElvisWang123)

Changes

This patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart.
This is the following patch of #93435


Patch is 67.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97797.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+17)
  • (modified) llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll (+603)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index d603138773de4..63a554f4740dc 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -985,6 +985,23 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
       return getArithmeticInstrCost(*FOp, ICA.getReturnType(), CostKind);
     break;
   }
+  // vp int cast ops.
+  case Intrinsic::vp_trunc:
+  case Intrinsic::vp_zext:
+  case Intrinsic::vp_sext:
+  // vp float cast ops.
+  case Intrinsic::vp_fptoui:
+  case Intrinsic::vp_fptosi:
+  case Intrinsic::vp_uitofp:
+  case Intrinsic::vp_sitofp:
+  case Intrinsic::vp_fptrunc:
+  case Intrinsic::vp_fpext: {
+    std::optional<unsigned> FOp =
+        VPIntrinsic::getFunctionalOpcodeForVP(ICA.getID());
+    if (FOp && !ICA.getArgTypes().empty())
+      return getCastInstrCost(*FOp, ICA.getReturnType(), ICA.getArgTypes()[0], TTI::CastContextHint::None, CostKind);
+    break;
+  }
   }
 
   if (ST->hasVInstructions() && RetTy->isVectorTy()) {
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
index 87ffb23dcb88e..2dc1efc7b5265 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
@@ -19,6 +19,609 @@ define void @unsupported_fp_ops(<vscale x 4 x float> %vec, i32 %extraarg) {
   ret void
 }
 
+define void @int_truncate() {
+; CHECK-LABEL: 'int_truncate'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_truncate'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  trunc i32 undef to i8
+  trunc <1 x i32> undef to <1 x i8>
+  trunc <2 x i32> undef to <2 x i8>
+  trunc <4 x i32> undef to <4 x i8>
+  trunc <8 x i32> undef to <8 x i8>
+  trunc <16 x i32> undef to <16 x i8>
+  trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+  trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+  trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+  trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+  call <1 x i8> @llvm.vp.trunc.v1i32.v1i8(<1 x i32> undef, <1 x i1> undef, i32 undef)
+  call <2 x i8> @llvm.vp.trunc.v2i32.v2i8(<2 x i32> undef, <2 x i1> undef, i32 undef)
+  call <4 x i8> @llvm.vp.trunc.v4i32.v4i8(<4 x i32> undef, <4 x i1> undef, i32 undef)
+  call <8 x i8> @llvm.vp.trunc.v8i32.v8i8(<8 x i32> undef, <8 x i1> undef, i32 undef)
+  call <16 x i8> @llvm.vp.trunc.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+  call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i32.nxv1i8(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+  call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i32.nxv2i8(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+  call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i32.nxv4i8(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+  call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i32.nxv8i8(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+  ret void
+}
+
+define void @int_zext() {
+; CHECK-LABEL: 'int_zext'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_zext'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  zext i32 undef to i64
+  zext <1 x i32> undef to <1 x i64>
+  zext <2 x i32> undef to <2 x i64>
+  zext <4 x i32> undef to <4 x i64>
+  zext <8 x i32> undef to <8 x i64>
+  zext <16 x i32> undef to <16 x i64>
+  zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+  zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+  zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+  zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+  call <1 x i64> @llvm.vp.zext.v1i32.v1i64(<1 x i32> undef, <1 x i1> undef, i32 undef)
+  call <2 x i64> @llvm.vp.zext.v2i32.v2i64(<2 x i32> undef, <2 x i1> undef, i32 undef)
+  call <4 x i64> @llvm.vp.zext.v4i32.v4i64(<4 x i32> undef, <4 x i1> undef, i32 undef)
+  call <8 x i64> @llvm.vp.zext.v8i32.v8i64(<8 x i32> undef, <8 x i1> undef, i32 undef)
+  call <16 x i64> @llvm.vp.zext.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+  call <vscale x 1 x i64> @llvm.vp.zext.nxv1i32.nxv1i64(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+  call <vscale x 2 x i64> @llvm.vp.zext.nxv2i32.nxv2i64(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+  call <vscale x 4 x i64> @llvm.vp.zext.nxv4i32.nxv4i64(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+  call <vscale x 8 x i64> @llvm.vp.zext.nxv8i32.nxv8i64(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+  ret void
+}
+
+define void @int_sext() {
+; CHECK-LABEL: 'int_sext'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.sext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.sext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.sext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.sext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.sext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.sext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.sext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.sext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_sext'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x ...
[truncated]

Copy link

github-actions bot commented Jul 5, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@ElvisWang123
Copy link
Contributor Author

Gentle ping

@ElvisWang123
Copy link
Contributor Author

@lukel97 Gentle Ping.

@ElvisWang123 ElvisWang123 force-pushed the add-vp-cast-cost branch 2 times, most recently from 345a89a to a5c23c8 Compare September 2, 2024 03:11
@ElvisWang123
Copy link
Contributor Author

@lukel97 Gentle ping!

Copy link
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a small nit about the run lines

@ElvisWang123 ElvisWang123 merged commit 845d8d9 into llvm:main Sep 5, 2024
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants