-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[RISCV][TTI] Add cost of typebased cast VPIntrinsics with functionalOPC. #97797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-risc-v Author: Elvis Wang (ElvisWang123) ChangesThis patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart. Patch is 67.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97797.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index d603138773de4..63a554f4740dc 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -985,6 +985,23 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
return getArithmeticInstrCost(*FOp, ICA.getReturnType(), CostKind);
break;
}
+ // vp int cast ops.
+ case Intrinsic::vp_trunc:
+ case Intrinsic::vp_zext:
+ case Intrinsic::vp_sext:
+ // vp float cast ops.
+ case Intrinsic::vp_fptoui:
+ case Intrinsic::vp_fptosi:
+ case Intrinsic::vp_uitofp:
+ case Intrinsic::vp_sitofp:
+ case Intrinsic::vp_fptrunc:
+ case Intrinsic::vp_fpext: {
+ std::optional<unsigned> FOp =
+ VPIntrinsic::getFunctionalOpcodeForVP(ICA.getID());
+ if (FOp && !ICA.getArgTypes().empty())
+ return getCastInstrCost(*FOp, ICA.getReturnType(), ICA.getArgTypes()[0], TTI::CastContextHint::None, CostKind);
+ break;
+ }
}
if (ST->hasVInstructions() && RetTy->isVectorTy()) {
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
index 87ffb23dcb88e..2dc1efc7b5265 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
@@ -19,6 +19,609 @@ define void @unsupported_fp_ops(<vscale x 4 x float> %vec, i32 %extraarg) {
ret void
}
+define void @int_truncate() {
+; CHECK-LABEL: 'int_truncate'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_truncate'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ trunc i32 undef to i8
+ trunc <1 x i32> undef to <1 x i8>
+ trunc <2 x i32> undef to <2 x i8>
+ trunc <4 x i32> undef to <4 x i8>
+ trunc <8 x i32> undef to <8 x i8>
+ trunc <16 x i32> undef to <16 x i8>
+ trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+ trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+ trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+ trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+ call <1 x i8> @llvm.vp.trunc.v1i32.v1i8(<1 x i32> undef, <1 x i1> undef, i32 undef)
+ call <2 x i8> @llvm.vp.trunc.v2i32.v2i8(<2 x i32> undef, <2 x i1> undef, i32 undef)
+ call <4 x i8> @llvm.vp.trunc.v4i32.v4i8(<4 x i32> undef, <4 x i1> undef, i32 undef)
+ call <8 x i8> @llvm.vp.trunc.v8i32.v8i8(<8 x i32> undef, <8 x i1> undef, i32 undef)
+ call <16 x i8> @llvm.vp.trunc.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+ call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i32.nxv1i8(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+ call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i32.nxv2i8(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i32.nxv4i8(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i32.nxv8i8(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+ ret void
+}
+
+define void @int_zext() {
+; CHECK-LABEL: 'int_zext'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_zext'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ zext i32 undef to i64
+ zext <1 x i32> undef to <1 x i64>
+ zext <2 x i32> undef to <2 x i64>
+ zext <4 x i32> undef to <4 x i64>
+ zext <8 x i32> undef to <8 x i64>
+ zext <16 x i32> undef to <16 x i64>
+ zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+ zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+ zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+ zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+ call <1 x i64> @llvm.vp.zext.v1i32.v1i64(<1 x i32> undef, <1 x i1> undef, i32 undef)
+ call <2 x i64> @llvm.vp.zext.v2i32.v2i64(<2 x i32> undef, <2 x i1> undef, i32 undef)
+ call <4 x i64> @llvm.vp.zext.v4i32.v4i64(<4 x i32> undef, <4 x i1> undef, i32 undef)
+ call <8 x i64> @llvm.vp.zext.v8i32.v8i64(<8 x i32> undef, <8 x i1> undef, i32 undef)
+ call <16 x i64> @llvm.vp.zext.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+ call <vscale x 1 x i64> @llvm.vp.zext.nxv1i32.nxv1i64(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+ call <vscale x 2 x i64> @llvm.vp.zext.nxv2i32.nxv2i64(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i64> @llvm.vp.zext.nxv4i32.nxv4i64(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i64> @llvm.vp.zext.nxv8i32.nxv8i64(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+ ret void
+}
+
+define void @int_sext() {
+; CHECK-LABEL: 'int_sext'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.sext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.sext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.sext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.sext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.sext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.sext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.sext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.sext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_sext'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x ...
[truncated]
|
@llvm/pr-subscribers-llvm-analysis Author: Elvis Wang (ElvisWang123) ChangesThis patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart. Patch is 67.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97797.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index d603138773de4..63a554f4740dc 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -985,6 +985,23 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
return getArithmeticInstrCost(*FOp, ICA.getReturnType(), CostKind);
break;
}
+ // vp int cast ops.
+ case Intrinsic::vp_trunc:
+ case Intrinsic::vp_zext:
+ case Intrinsic::vp_sext:
+ // vp float cast ops.
+ case Intrinsic::vp_fptoui:
+ case Intrinsic::vp_fptosi:
+ case Intrinsic::vp_uitofp:
+ case Intrinsic::vp_sitofp:
+ case Intrinsic::vp_fptrunc:
+ case Intrinsic::vp_fpext: {
+ std::optional<unsigned> FOp =
+ VPIntrinsic::getFunctionalOpcodeForVP(ICA.getID());
+ if (FOp && !ICA.getArgTypes().empty())
+ return getCastInstrCost(*FOp, ICA.getReturnType(), ICA.getArgTypes()[0], TTI::CastContextHint::None, CostKind);
+ break;
+ }
}
if (ST->hasVInstructions() && RetTy->isVectorTy()) {
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
index 87ffb23dcb88e..2dc1efc7b5265 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
@@ -19,6 +19,609 @@ define void @unsupported_fp_ops(<vscale x 4 x float> %vec, i32 %extraarg) {
ret void
}
+define void @int_truncate() {
+; CHECK-LABEL: 'int_truncate'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_truncate'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %1 = trunc i32 undef to i8
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %2 = trunc <1 x i32> undef to <1 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %3 = trunc <2 x i32> undef to <2 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = trunc <4 x i32> undef to <4 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %5 = trunc <8 x i32> undef to <8 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %6 = trunc <16 x i32> undef to <16 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %7 = trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %9 = trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %10 = trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %11 = call <1 x i8> @llvm.vp.trunc.v1i8.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i8> @llvm.vp.trunc.v2i8.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i8> @llvm.vp.trunc.v4i8.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i8> @llvm.vp.trunc.v8i8.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %15 = call <16 x i8> @llvm.vp.trunc.v16i8.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i8.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i8.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i8.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %19 = call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i8.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ trunc i32 undef to i8
+ trunc <1 x i32> undef to <1 x i8>
+ trunc <2 x i32> undef to <2 x i8>
+ trunc <4 x i32> undef to <4 x i8>
+ trunc <8 x i32> undef to <8 x i8>
+ trunc <16 x i32> undef to <16 x i8>
+ trunc <vscale x 1 x i32> undef to <vscale x 1 x i8>
+ trunc <vscale x 2 x i32> undef to <vscale x 2 x i8>
+ trunc <vscale x 4 x i32> undef to <vscale x 4 x i8>
+ trunc <vscale x 8 x i32> undef to <vscale x 8 x i8>
+ call <1 x i8> @llvm.vp.trunc.v1i32.v1i8(<1 x i32> undef, <1 x i1> undef, i32 undef)
+ call <2 x i8> @llvm.vp.trunc.v2i32.v2i8(<2 x i32> undef, <2 x i1> undef, i32 undef)
+ call <4 x i8> @llvm.vp.trunc.v4i32.v4i8(<4 x i32> undef, <4 x i1> undef, i32 undef)
+ call <8 x i8> @llvm.vp.trunc.v8i32.v8i8(<8 x i32> undef, <8 x i1> undef, i32 undef)
+ call <16 x i8> @llvm.vp.trunc.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+ call <vscale x 1 x i8> @llvm.vp.trunc.nxv1i32.nxv1i8(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+ call <vscale x 2 x i8> @llvm.vp.trunc.nxv2i32.nxv2i8(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i8> @llvm.vp.trunc.nxv4i32.nxv4i8(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i8> @llvm.vp.trunc.nxv8i32.nxv8i8(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+ ret void
+}
+
+define void @int_zext() {
+; CHECK-LABEL: 'int_zext'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_zext'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = zext i32 undef to i64
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = zext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = zext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = zext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %5 = zext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %6 = zext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.zext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.zext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.zext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.zext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.zext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.zext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.zext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.zext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ zext i32 undef to i64
+ zext <1 x i32> undef to <1 x i64>
+ zext <2 x i32> undef to <2 x i64>
+ zext <4 x i32> undef to <4 x i64>
+ zext <8 x i32> undef to <8 x i64>
+ zext <16 x i32> undef to <16 x i64>
+ zext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+ zext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+ zext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+ zext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+ call <1 x i64> @llvm.vp.zext.v1i32.v1i64(<1 x i32> undef, <1 x i1> undef, i32 undef)
+ call <2 x i64> @llvm.vp.zext.v2i32.v2i64(<2 x i32> undef, <2 x i1> undef, i32 undef)
+ call <4 x i64> @llvm.vp.zext.v4i32.v4i64(<4 x i32> undef, <4 x i1> undef, i32 undef)
+ call <8 x i64> @llvm.vp.zext.v8i32.v8i64(<8 x i32> undef, <8 x i1> undef, i32 undef)
+ call <16 x i64> @llvm.vp.zext.v16.v16(<16 x i32> undef, <16 x i1> undef, i32 undef)
+ call <vscale x 1 x i64> @llvm.vp.zext.nxv1i32.nxv1i64(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+ call <vscale x 2 x i64> @llvm.vp.zext.nxv2i32.nxv2i64(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i64> @llvm.vp.zext.nxv4i32.nxv4i64(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i64> @llvm.vp.zext.nxv8i32.nxv8i64(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+ ret void
+}
+
+define void @int_sext() {
+; CHECK-LABEL: 'int_sext'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i64> @llvm.vp.sext.v2i64.v2i32(<2 x i32> undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i64> @llvm.vp.sext.v4i64.v4i32(<4 x i32> undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %14 = call <8 x i64> @llvm.vp.sext.v8i64.v8i32(<8 x i32> undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %15 = call <16 x i64> @llvm.vp.sext.v16i64.v16i32(<16 x i32> undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i64> @llvm.vp.sext.nxv1i64.nxv1i32(<vscale x 1 x i32> undef, <vscale x 1 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i64> @llvm.vp.sext.nxv2i64.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %18 = call <vscale x 4 x i64> @llvm.vp.sext.nxv4i64.nxv4i32(<vscale x 4 x i32> undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %19 = call <vscale x 8 x i64> @llvm.vp.sext.nxv8i64.nxv8i32(<vscale x 8 x i32> undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'int_sext'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = sext i32 undef to i64
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = sext <1 x i32> undef to <1 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = sext <2 x i32> undef to <2 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = sext <4 x i32> undef to <4 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %5 = sext <8 x i32> undef to <8 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %6 = sext <16 x i32> undef to <16 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = sext <vscale x 1 x i32> undef to <vscale x 1 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = sext <vscale x 2 x i32> undef to <vscale x 2 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %9 = sext <vscale x 4 x i32> undef to <vscale x 4 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %10 = sext <vscale x 8 x i32> undef to <vscale x 8 x i64>
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <1 x i64> @llvm.vp.sext.v1i64.v1i32(<1 x i32> undef, <1 x ...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
f39f4fe
to
21558b7
Compare
Gentle ping |
21558b7
to
0718bac
Compare
@lukel97 Gentle Ping. |
345a89a
to
a5c23c8
Compare
This patch make the instrudction cost of typebased cast VP intrinsics that will be same as their non-VP counterpart.
a5c23c8
to
411b6a4
Compare
411b6a4
to
6eb4326
Compare
@lukel97 Gentle ping! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a small nit about the run lines
This patch make the instruction cost of type-based cast VP intrinsics will be same as their non-VP counterpart.
This is the following patch of #93435