-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[RISCV][CostModel] cost of vector cttz/ctlz under ZVBB #115800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-analysis Author: LiqinWeng (LiqinWeng) ChangesPatch is 35.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/115800.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 046f63fe1617c5..9fa6624aae89f8 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1035,6 +1035,8 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
return LT.first;
break;
}
+ case Intrinsic::cttz:
+ case Intrinsic::ctlz:
case Intrinsic::ctpop: {
auto LT = getTypeLegalizationCost(RetTy);
if (ST->hasVInstructions() && ST->hasStdExtZvbb() && LT.second.isVector())
diff --git a/llvm/test/Analysis/CostModel/RISCV/int-bit-manip.ll b/llvm/test/Analysis/CostModel/RISCV/int-bit-manip.ll
index ea05464b084086..911993caba4d65 100644
--- a/llvm/test/Analysis/CostModel/RISCV/int-bit-manip.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/int-bit-manip.ll
@@ -157,6 +157,260 @@ define void @bitreverse() {
ret void
}
+define void @ctlz() {
+; NOZVBB-LABEL: 'ctlz'
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %1 = call <2 x i8> @llvm.ctlz.v2i8(<2 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %2 = call <4 x i8> @llvm.ctlz.v4i8(<4 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %3 = call <8 x i8> @llvm.ctlz.v8i8(<8 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %4 = call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %5 = call <vscale x 1 x i8> @llvm.ctlz.nxv1i8(<vscale x 1 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %6 = call <vscale x 2 x i8> @llvm.ctlz.nxv2i8(<vscale x 2 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %7 = call <vscale x 4 x i8> @llvm.ctlz.nxv4i8(<vscale x 4 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = call <vscale x 8 x i8> @llvm.ctlz.nxv8i8(<vscale x 8 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %9 = call <vscale x 16 x i8> @llvm.ctlz.nxv16i8(<vscale x 16 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %10 = call <vscale x 32 x i8> @llvm.ctlz.nxv32i8(<vscale x 32 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %11 = call <vscale x 64 x i8> @llvm.ctlz.nxv64i8(<vscale x 64 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %12 = call <2 x i16> @llvm.ctlz.v2i16(<2 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %13 = call <4 x i16> @llvm.ctlz.v4i16(<4 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %14 = call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %15 = call <16 x i16> @llvm.ctlz.v16i16(<16 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %16 = call <vscale x 1 x i16> @llvm.ctlz.nxv1i16(<vscale x 1 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %17 = call <vscale x 2 x i16> @llvm.ctlz.nxv2i16(<vscale x 2 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %18 = call <vscale x 4 x i16> @llvm.ctlz.nxv4i16(<vscale x 4 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %19 = call <vscale x 8 x i16> @llvm.ctlz.nxv8i16(<vscale x 8 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %20 = call <vscale x 16 x i16> @llvm.ctlz.nxv16i16(<vscale x 16 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %21 = call <vscale x 32 x i16> @llvm.ctlz.nxv32i16(<vscale x 32 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %22 = call <2 x i32> @llvm.ctlz.v2i32(<2 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %23 = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %24 = call <8 x i32> @llvm.ctlz.v8i32(<8 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %25 = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %26 = call <vscale x 1 x i32> @llvm.ctlz.nxv1i32(<vscale x 1 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %27 = call <vscale x 2 x i32> @llvm.ctlz.nxv2i32(<vscale x 2 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %28 = call <vscale x 4 x i32> @llvm.ctlz.nxv4i32(<vscale x 4 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %29 = call <vscale x 8 x i32> @llvm.ctlz.nxv8i32(<vscale x 8 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %30 = call <vscale x 16 x i32> @llvm.ctlz.nxv16i32(<vscale x 16 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %31 = call <2 x i64> @llvm.ctlz.v2i64(<2 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %32 = call <4 x i64> @llvm.ctlz.v4i64(<4 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %33 = call <8 x i64> @llvm.ctlz.v8i64(<8 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %34 = call <16 x i64> @llvm.ctlz.v16i64(<16 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %35 = call <vscale x 1 x i64> @llvm.ctlz.nxv1i64(<vscale x 1 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %36 = call <vscale x 2 x i64> @llvm.ctlz.nxv2i64(<vscale x 2 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %37 = call <vscale x 4 x i64> @llvm.ctlz.nxv4i64(<vscale x 4 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %38 = call <vscale x 8 x i64> @llvm.ctlz.nxv8i64(<vscale x 8 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %39 = call <vscale x 16 x i64> @llvm.ctlz.nxv16i64(<vscale x 16 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; ZVBB-LABEL: 'ctlz'
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = call <2 x i8> @llvm.ctlz.v2i8(<2 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = call <4 x i8> @llvm.ctlz.v4i8(<4 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = call <8 x i8> @llvm.ctlz.v8i8(<8 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %4 = call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %5 = call <vscale x 1 x i8> @llvm.ctlz.nxv1i8(<vscale x 1 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %6 = call <vscale x 2 x i8> @llvm.ctlz.nxv2i8(<vscale x 2 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = call <vscale x 4 x i8> @llvm.ctlz.nxv4i8(<vscale x 4 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %8 = call <vscale x 8 x i8> @llvm.ctlz.nxv8i8(<vscale x 8 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %9 = call <vscale x 16 x i8> @llvm.ctlz.nxv16i8(<vscale x 16 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %10 = call <vscale x 32 x i8> @llvm.ctlz.nxv32i8(<vscale x 32 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = call <vscale x 64 x i8> @llvm.ctlz.nxv64i8(<vscale x 64 x i8> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = call <2 x i16> @llvm.ctlz.v2i16(<2 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %13 = call <4 x i16> @llvm.ctlz.v4i16(<4 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %14 = call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %15 = call <16 x i16> @llvm.ctlz.v16i16(<16 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %16 = call <vscale x 1 x i16> @llvm.ctlz.nxv1i16(<vscale x 1 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %17 = call <vscale x 2 x i16> @llvm.ctlz.nxv2i16(<vscale x 2 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %18 = call <vscale x 4 x i16> @llvm.ctlz.nxv4i16(<vscale x 4 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %19 = call <vscale x 8 x i16> @llvm.ctlz.nxv8i16(<vscale x 8 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %20 = call <vscale x 16 x i16> @llvm.ctlz.nxv16i16(<vscale x 16 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %21 = call <vscale x 32 x i16> @llvm.ctlz.nxv32i16(<vscale x 32 x i16> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %22 = call <2 x i32> @llvm.ctlz.v2i32(<2 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %23 = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %24 = call <8 x i32> @llvm.ctlz.v8i32(<8 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %25 = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %26 = call <vscale x 1 x i32> @llvm.ctlz.nxv1i32(<vscale x 1 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %27 = call <vscale x 2 x i32> @llvm.ctlz.nxv2i32(<vscale x 2 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %28 = call <vscale x 4 x i32> @llvm.ctlz.nxv4i32(<vscale x 4 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %29 = call <vscale x 8 x i32> @llvm.ctlz.nxv8i32(<vscale x 8 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %30 = call <vscale x 16 x i32> @llvm.ctlz.nxv16i32(<vscale x 16 x i32> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %31 = call <2 x i64> @llvm.ctlz.v2i64(<2 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %32 = call <4 x i64> @llvm.ctlz.v4i64(<4 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %33 = call <8 x i64> @llvm.ctlz.v8i64(<8 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %34 = call <16 x i64> @llvm.ctlz.v16i64(<16 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %35 = call <vscale x 1 x i64> @llvm.ctlz.nxv1i64(<vscale x 1 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %36 = call <vscale x 2 x i64> @llvm.ctlz.nxv2i64(<vscale x 2 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %37 = call <vscale x 4 x i64> @llvm.ctlz.nxv4i64(<vscale x 4 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %38 = call <vscale x 8 x i64> @llvm.ctlz.nxv8i64(<vscale x 8 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %39 = call <vscale x 16 x i64> @llvm.ctlz.nxv16i64(<vscale x 16 x i64> undef, i1 false)
+; ZVBB-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ call <2 x i8> @llvm.ctlz.v2i8(<2 x i8> undef, i1 false)
+ call <4 x i8> @llvm.ctlz.v4i8(<4 x i8> undef, i1 false)
+ call <8 x i8> @llvm.ctlz.v8i8(<8 x i8> undef, i1 false)
+ call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> undef, i1 false)
+ call <vscale x 1 x i8> @llvm.ctlz.nxv1i8(<vscale x 1 x i8> undef, i1 false)
+ call <vscale x 2 x i8> @llvm.ctlz.nxv2i8(<vscale x 2 x i8> undef, i1 false)
+ call <vscale x 4 x i8> @llvm.ctlz.nxv4i8(<vscale x 4 x i8> undef, i1 false)
+ call <vscale x 8 x i8> @llvm.ctlz.nxv8i8(<vscale x 8 x i8> undef, i1 false)
+ call <vscale x 16 x i8> @llvm.ctlz.nxv16i8(<vscale x 16 x i8> undef, i1 false)
+ call <vscale x 32 x i8> @llvm.ctlz.nxv32i8(<vscale x 32 x i8> undef, i1 false)
+ call <vscale x 64 x i8> @llvm.ctlz.nxv64i8(<vscale x 64 x i8> undef, i1 false)
+ call <2 x i16> @llvm.ctlz.v2i16(<2 x i16> undef, i1 false)
+ call <4 x i16> @llvm.ctlz.v4i16(<4 x i16> undef, i1 false)
+ call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> undef, i1 false)
+ call <16 x i16> @llvm.ctlz.v16i16(<16 x i16> undef, i1 false)
+ call <vscale x 1 x i16> @llvm.ctlz.nxv1i16(<vscale x 1 x i16> undef, i1 false)
+ call <vscale x 2 x i16> @llvm.ctlz.nxv2i16(<vscale x 2 x i16> undef, i1 false)
+ call <vscale x 4 x i16> @llvm.ctlz.nxv4i16(<vscale x 4 x i16> undef, i1 false)
+ call <vscale x 8 x i16> @llvm.ctlz.nxv8i16(<vscale x 8 x i16> undef, i1 false)
+ call <vscale x 16 x i16> @llvm.ctlz.nxv16i16(<vscale x 16 x i16> undef, i1 false)
+ call <vscale x 32 x i16> @llvm.ctlz.nxv32i16(<vscale x 32 x i16> undef, i1 false)
+ call <2 x i32> @llvm.ctlz.v2i32(<2 x i32> undef, i1 false)
+ call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> undef, i1 false)
+ call <8 x i32> @llvm.ctlz.v8i32(<8 x i32> undef, i1 false)
+ call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> undef, i1 false)
+ call <vscale x 1 x i32> @llvm.ctlz.nxv1i32(<vscale x 1 x i32> undef, i1 false)
+ call <vscale x 2 x i32> @llvm.ctlz.nxv2i32(<vscale x 2 x i32> undef, i1 false)
+ call <vscale x 4 x i32> @llvm.ctlz.nxv4i32(<vscale x 4 x i32> undef, i1 false)
+ call <vscale x 8 x i32> @llvm.ctlz.nxv8i32(<vscale x 8 x i32> undef, i1 false)
+ call <vscale x 16 x i32> @llvm.ctlz.nxv16i32(<vscale x 16 x i32> undef, i1 false)
+ call <2 x i64> @llvm.ctlz.v2i64(<2 x i64> undef, i1 false)
+ call <4 x i64> @llvm.ctlz.v4i64(<4 x i64> undef, i1 false)
+ call <8 x i64> @llvm.ctlz.v8i64(<8 x i64> undef, i1 false)
+ call <16 x i64> @llvm.ctlz.v16i64(<16 x i64> undef, i1 false)
+ call <vscale x 1 x i64> @llvm.ctlz.nxv1i64(<vscale x 1 x i64> undef, i1 false)
+ call <vscale x 2 x i64> @llvm.ctlz.nxv2i64(<vscale x 2 x i64> undef, i1 false)
+ call <vscale x 4 x i64> @llvm.ctlz.nxv4i64(<vscale x 4 x i64> undef, i1 false)
+ call <vscale x 8 x i64> @llvm.ctlz.nxv8i64(<vscale x 8 x i64> undef, i1 false)
+ call <vscale x 16 x i64> @llvm.ctlz.nxv16i64(<vscale x 16 x i64> undef, i1 false)
+ ret void
+}
+
+define void @cttz() {
+; NOZVBB-LABEL: 'cttz'
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %1 = call <2 x i8> @llvm.cttz.v2i8(<2 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 44 for instruction: %2 = call <4 x i8> @llvm.cttz.v4i8(<4 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 88 for instruction: %3 = call <8 x i8> @llvm.cttz.v8i8(<8 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 176 for instruction: %4 = call <16 x i8> @llvm.cttz.v16i8(<16 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %5 = call <vscale x 1 x i8> @llvm.cttz.nxv1i8(<vscale x 1 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %6 = call <vscale x 2 x i8> @llvm.cttz.nxv2i8(<vscale x 2 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %7 = call <vscale x 4 x i8> @llvm.cttz.nxv4i8(<vscale x 4 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %8 = call <vscale x 8 x i8> @llvm.cttz.nxv8i8(<vscale x 8 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %9 = call <vscale x 16 x i8> @llvm.cttz.nxv16i8(<vscale x 16 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %10 = call <vscale x 32 x i8> @llvm.cttz.nxv32i8(<vscale x 32 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %11 = call <vscale x 64 x i8> @llvm.cttz.nxv64i8(<vscale x 64 x i8> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %12 = call <2 x i16> @llvm.cttz.v2i16(<2 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 44 for instruction: %13 = call <4 x i16> @llvm.cttz.v4i16(<4 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 88 for instruction: %14 = call <8 x i16> @llvm.cttz.v8i16(<8 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 191 for instruction: %15 = call <16 x i16> @llvm.cttz.v16i16(<16 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %16 = call <vscale x 1 x i16> @llvm.cttz.nxv1i16(<vscale x 1 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %17 = call <vscale x 2 x i16> @llvm.cttz.nxv2i16(<vscale x 2 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %18 = call <vscale x 4 x i16> @llvm.cttz.nxv4i16(<vscale x 4 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %19 = call <vscale x 8 x i16> @llvm.cttz.nxv8i16(<vscale x 8 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %20 = call <vscale x 16 x i16> @llvm.cttz.nxv16i16(<vscale x 16 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %21 = call <vscale x 32 x i16> @llvm.cttz.nxv32i16(<vscale x 32 x i16> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %22 = call <2 x i32> @llvm.cttz.v2i32(<2 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 44 for instruction: %23 = call <4 x i32> @llvm.cttz.v4i32(<4 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 95 for instruction: %24 = call <8 x i32> @llvm.cttz.v8i32(<8 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 191 for instruction: %25 = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %26 = call <vscale x 1 x i32> @llvm.cttz.nxv1i32(<vscale x 1 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %27 = call <vscale x 2 x i32> @llvm.cttz.nxv2i32(<vscale x 2 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %28 = call <vscale x 4 x i32> @llvm.cttz.nxv4i32(<vscale x 4 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %29 = call <vscale x 8 x i32> @llvm.cttz.nxv8i32(<vscale x 8 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Invalid cost for instruction: %30 = call <vscale x 16 x i32> @llvm.cttz.nxv16i32(<vscale x 16 x i32> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %31 = call <2 x i64> @llvm.cttz.v2i64(<2 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 47 for instruction: %32 = call <4 x i64> @llvm.cttz.v4i64(<4 x i64> undef, i1 false)
+; NOZVBB-NEXT: Cost Model: Found an estimated cost of 95 for instruction: %33 = call...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %35 = call <vscale x 1 x i64> @llvm.ctlz.nxv1i64(<vscale x 1 x i64> undef, i1 false) | ||
; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %36 = call <vscale x 2 x i64> @llvm.ctlz.nxv2i64(<vscale x 2 x i64> undef, i1 false) | ||
; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %37 = call <vscale x 4 x i64> @llvm.ctlz.nxv4i64(<vscale x 4 x i64> undef, i1 false) | ||
; ZVBB-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %38 = call <vscale x 8 x i64> @llvm.ctlz.nxv8i64(<vscale x 8 x i64> undef, i1 false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we still need to scale the cost by LMUL, for both the zvbb and non-zvbb costs. This might be worth doing in a follow up PR
No description provided.