[RISCV][TTI] Add cost of typebased cast VPIntrinsics with functionalOPC. #97797

ElvisWang123 · 2024-07-05T08:07:55Z

This patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart.
This is the following patch of #93435

llvmbot · 2024-07-05T08:08:28Z

@llvm/pr-subscribers-backend-risc-v

Author: Elvis Wang (ElvisWang123)

Changes

This patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart.
This is the following patch of #93435

Patch is 67.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97797.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+17)
(modified) llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll (+603)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index d603138773de4..63a554f4740dc 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -985,6 +985,23 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
       return getArithmeticInstrCost(*FOp, ICA.getReturnType(), CostKind);
     break;
   }
+  // vp int cast ops.
+  case Intrinsic::vp_trunc:
+  case Intrinsic::vp_zext:
+  case Intrinsic::vp_sext:
+  // vp float cast ops.
+  case Intrinsic::vp_fptoui:
+  case Intrinsic::vp_fptosi:
+  case Intrinsic::vp_uitofp:
+  case Intrinsic::vp_sitofp:
+  case Intrinsic::vp_fptrunc:
+  case Intrinsic::vp_fpext: {
+    std::optional<unsigned> FOp =
+        VPIntrinsic::getFunctionalOpcodeForVP(ICA.getID());
+    if (FOp && !ICA.getArgTypes().empty())
+      return getCastInstrCost(*FOp, ICA.getReturnType(), ICA.getArgTypes()[0], TTI::CastContextHint::None, CostKind);
+    break;
+  }
   }
 
   if (ST->hasVInstructions() && RetTy->isVectorTy()) {
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
index 87ffb23dcb88e..2dc1efc7b5265 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
@@ -19,6 +19,609 @@ define void @unsupported_fp_ops(<vscale x 4 x float> %vec, i32 %extraarg) {
   ret void
 }
 
+define void @int_truncate() {
+; CHECK-LABEL: 'int_truncate'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_truncate'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  trunc i32 undef to i8
+  trunc <1 x i32> undef to <1 x i8>
+  trunc <2 x i32> undef to <2 x i8>
+  trunc <4 x i32> undef to <4 x i8>
+  trunc <8 x i32> undef to <8 x i8>
+  trunc <16 x i32> undef to <16 x i8>
+  trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+  trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+  trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+  trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+  call <1 x i8> @llvm.vp.trunc.v1i32.v1i8(<1 x i32> undef, <1 x i1> undef, i32 undef)
+  call <2 x i8> @llvm.vp.trunc.v2i32.v2i8(<2 x i32> undef, <2 x i1> undef, i32 undef)
+  call <4 x i8> @llvm.vp.trunc.v4i32.v4i8(<4 x i32> undef, <4 x i1> undef, i32 undef)
+  call <8 x i8> @llvm.vp.trunc.v8i32.v8i8(<8 x i32> undef, <8 x i1> undef, i32 undef)
+  call <16 x i8> @llvm.vp.trunc.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+  call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i32.nxv1i8(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+  call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i32.nxv2i8(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+  call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i32.nxv4i8(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+  call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i32.nxv8i8(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+  ret void
+}
+
+define void @int_zext() {
+; CHECK-LABEL: 'int_zext'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_zext'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  zext i32 undef to i64
+  zext <1 x i32> undef to <1 x i64>
+  zext <2 x i32> undef to <2 x i64>
+  zext <4 x i32> undef to <4 x i64>
+  zext <8 x i32> undef to <8 x i64>
+  zext <16 x i32> undef to <16 x i64>
+  zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+  zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+  zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+  zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+  call <1 x i64> @llvm.vp.zext.v1i32.v1i64(<1 x i32> undef, <1 x i1> undef, i32 undef)
+  call <2 x i64> @llvm.vp.zext.v2i32.v2i64(<2 x i32> undef, <2 x i1> undef, i32 undef)
+  call <4 x i64> @llvm.vp.zext.v4i32.v4i64(<4 x i32> undef, <4 x i1> undef, i32 undef)
+  call <8 x i64> @llvm.vp.zext.v8i32.v8i64(<8 x i32> undef, <8 x i1> undef, i32 undef)
+  call <16 x i64> @llvm.vp.zext.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+  call <vscale x 1 x i64> @llvm.vp.zext.nxv1i32.nxv1i64(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+  call <vscale x 2 x i64> @llvm.vp.zext.nxv2i32.nxv2i64(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+  call <vscale x 4 x i64> @llvm.vp.zext.nxv4i32.nxv4i64(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+  call <vscale x 8 x i64> @llvm.vp.zext.nxv8i32.nxv8i64(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+  ret void
+}
+
+define void @int_sext() {
+; CHECK-LABEL: 'int_sext'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.sext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.sext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.sext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.sext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.sext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.sext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.sext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.sext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_sext'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x ...
[truncated]

llvmbot · 2024-07-05T08:08:28Z

@llvm/pr-subscribers-llvm-analysis

Author: Elvis Wang (ElvisWang123)

Changes

This patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart.
This is the following patch of #93435

Patch is 67.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97797.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+17)
(modified) llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll (+603)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index d603138773de4..63a554f4740dc 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -985,6 +985,23 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
       return getArithmeticInstrCost(*FOp, ICA.getReturnType(), CostKind);
     break;
   }
+  // vp int cast ops.
+  case Intrinsic::vp_trunc:
+  case Intrinsic::vp_zext:
+  case Intrinsic::vp_sext:
+  // vp float cast ops.
+  case Intrinsic::vp_fptoui:
+  case Intrinsic::vp_fptosi:
+  case Intrinsic::vp_uitofp:
+  case Intrinsic::vp_sitofp:
+  case Intrinsic::vp_fptrunc:
+  case Intrinsic::vp_fpext: {
+    std::optional<unsigned> FOp =
+        VPIntrinsic::getFunctionalOpcodeForVP(ICA.getID());
+    if (FOp && !ICA.getArgTypes().empty())
+      return getCastInstrCost(*FOp, ICA.getReturnType(), ICA.getArgTypes()[0], TTI::CastContextHint::None, CostKind);
+    break;
+  }
   }
 
   if (ST->hasVInstructions() && RetTy->isVectorTy()) {
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
index 87ffb23dcb88e..2dc1efc7b5265 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
@@ -19,6 +19,609 @@ define void @unsupported_fp_ops(<vscale x 4 x float> %vec, i32 %extraarg) {
   ret void
 }
 
+define void @int_truncate() {
+; CHECK-LABEL: 'int_truncate'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_truncate'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  trunc i32 undef to i8
+  trunc <1 x i32> undef to <1 x i8>
+  trunc <2 x i32> undef to <2 x i8>
+  trunc <4 x i32> undef to <4 x i8>
+  trunc <8 x i32> undef to <8 x i8>
+  trunc <16 x i32> undef to <16 x i8>
+  trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+  trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+  trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+  trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+  call <1 x i8> @llvm.vp.trunc.v1i32.v1i8(<1 x i32> undef, <1 x i1> undef, i32 undef)
+  call <2 x i8> @llvm.vp.trunc.v2i32.v2i8(<2 x i32> undef, <2 x i1> undef, i32 undef)
+  call <4 x i8> @llvm.vp.trunc.v4i32.v4i8(<4 x i32> undef, <4 x i1> undef, i32 undef)
+  call <8 x i8> @llvm.vp.trunc.v8i32.v8i8(<8 x i32> undef, <8 x i1> undef, i32 undef)
+  call <16 x i8> @llvm.vp.trunc.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+  call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i32.nxv1i8(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+  call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i32.nxv2i8(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+  call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i32.nxv4i8(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+  call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i32.nxv8i8(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+  ret void
+}
+
+define void @int_zext() {
+; CHECK-LABEL: 'int_zext'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_zext'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  zext i32 undef to i64
+  zext <1 x i32> undef to <1 x i64>
+  zext <2 x i32> undef to <2 x i64>
+  zext <4 x i32> undef to <4 x i64>
+  zext <8 x i32> undef to <8 x i64>
+  zext <16 x i32> undef to <16 x i64>
+  zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+  zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+  zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+  zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+  call <1 x i64> @llvm.vp.zext.v1i32.v1i64(<1 x i32> undef, <1 x i1> undef, i32 undef)
+  call <2 x i64> @llvm.vp.zext.v2i32.v2i64(<2 x i32> undef, <2 x i1> undef, i32 undef)
+  call <4 x i64> @llvm.vp.zext.v4i32.v4i64(<4 x i32> undef, <4 x i1> undef, i32 undef)
+  call <8 x i64> @llvm.vp.zext.v8i32.v8i64(<8 x i32> undef, <8 x i1> undef, i32 undef)
+  call <16 x i64> @llvm.vp.zext.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+  call <vscale x 1 x i64> @llvm.vp.zext.nxv1i32.nxv1i64(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+  call <vscale x 2 x i64> @llvm.vp.zext.nxv2i32.nxv2i64(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+  call <vscale x 4 x i64> @llvm.vp.zext.nxv4i32.nxv4i64(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+  call <vscale x 8 x i64> @llvm.vp.zext.nxv8i32.nxv8i64(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+  ret void
+}
+
+define void @int_sext() {
+; CHECK-LABEL: 'int_sext'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.sext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.sext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.sext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.sext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.sext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.sext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.sext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.sext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_sext'
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x ...
[truncated]

github-actions · 2024-07-05T08:12:05Z

✅ With the latest revision this PR passed the C/C++ code formatter.

ElvisWang123 · 2024-07-25T00:21:23Z

Gentle ping

llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

ElvisWang123 · 2024-08-29T01:39:10Z

@lukel97 Gentle Ping.

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

llvm/test/Analysis/CostModel/RISCV/vp-cast.ll

This patch make the instrudction cost of typebased cast VP intrinsics that will be same as their non-VP counterpart.

ElvisWang123 · 2024-09-04T01:47:31Z

@lukel97 Gentle ping!

lukel97

LGTM, just a small nit about the run lines

llvm/test/Analysis/CostModel/RISCV/cast.ll

ElvisWang123 requested review from arcbbb, preames and lukel97 July 5, 2024 08:08

llvmbot added backend:RISC-V llvm:analysis Includes value tracking, cost tables and constant folding labels Jul 5, 2024

ElvisWang123 force-pushed the add-vp-cast-cost branch from f39f4fe to 21558b7 Compare July 24, 2024 08:19

lukel97 reviewed Jul 25, 2024

View reviewed changes

llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll Outdated Show resolved Hide resolved

ElvisWang123 force-pushed the add-vp-cast-cost branch from 21558b7 to 0718bac Compare July 30, 2024 07:14

arcbbb reviewed Aug 6, 2024

View reviewed changes

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp Outdated Show resolved Hide resolved

lukel97 reviewed Aug 29, 2024

View reviewed changes

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp Show resolved Hide resolved

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp Outdated Show resolved Hide resolved

llvm/test/Analysis/CostModel/RISCV/vp-cast.ll Outdated Show resolved Hide resolved

ElvisWang123 force-pushed the add-vp-cast-cost branch 2 times, most recently from 345a89a to a5c23c8 Compare September 2, 2024 03:11

ElvisWang123 added 4 commits September 2, 2024 17:53

Precommit vp-cast intrinsic cost testcases

52423dd

[RISCV][TTI] Add cost of typebased cast VPIntrinsics with functionalOPC.

9f4b8ec

This patch make the instrudction cost of typebased cast VP intrinsics that will be same as their non-VP counterpart.

Address comment and migrate testcases to vp-cast

da426ed

Add assertions

6418d85

ElvisWang123 force-pushed the add-vp-cast-cost branch from a5c23c8 to 411b6a4 Compare September 3, 2024 01:24

Merge vp.cast tests into cast.ll

6eb4326

ElvisWang123 force-pushed the add-vp-cast-cost branch from 411b6a4 to 6eb4326 Compare September 3, 2024 06:38

lukel97 approved these changes Sep 4, 2024

View reviewed changes

llvm/test/Analysis/CostModel/RISCV/cast.ll Show resolved Hide resolved

Merge Runline

9709df5

ElvisWang123 merged commit 845d8d9 into llvm:main Sep 5, 2024
5 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV][TTI] Add cost of typebased cast VPIntrinsics with functionalOPC. #97797

[RISCV][TTI] Add cost of typebased cast VPIntrinsics with functionalOPC. #97797

Uh oh!

ElvisWang123 commented Jul 5, 2024

Uh oh!

llvmbot commented Jul 5, 2024

Uh oh!

llvmbot commented Jul 5, 2024

Uh oh!

github-actions bot commented Jul 5, 2024 •

edited

Loading

Uh oh!

ElvisWang123 commented Jul 25, 2024

Uh oh!

Uh oh!

Uh oh!

ElvisWang123 commented Aug 29, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ElvisWang123 commented Sep 4, 2024

Uh oh!

lukel97 left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

[RISCV][TTI] Add cost of typebased cast VPIntrinsics with functionalOPC. #97797

[RISCV][TTI] Add cost of typebased cast VPIntrinsics with functionalOPC. #97797

Uh oh!

Conversation

ElvisWang123 commented Jul 5, 2024

Uh oh!

llvmbot commented Jul 5, 2024

Uh oh!

llvmbot commented Jul 5, 2024

Uh oh!

github-actions bot commented Jul 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ElvisWang123 commented Jul 25, 2024

Uh oh!

Uh oh!

Uh oh!

ElvisWang123 commented Aug 29, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ElvisWang123 commented Sep 4, 2024

Uh oh!

lukel97 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Jul 5, 2024 •

edited

Loading