[RISCV][TTI] Add vp.fneg intrinsic cost with functionalOP #114378


Merged

LiqinWeng merged 6 commits into llvm:main from add-type-base-for-fneg on Nov 13, 2024

Conversation

LiqinWeng
Contributor

No description provided.

@llvmbot added the backend:RISC-V and llvm:analysis (includes value tracking, cost tables and constant folding) labels on Oct 31, 2024
@llvmbot
Member

llvmbot commented Oct 31, 2024

@llvm/pr-subscribers-backend-risc-v

Author: LiqinWeng (LiqinWeng)

Changes

Patch is 301.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114378.diff

9 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+2-1)
  • (modified) llvm/test/Analysis/CostModel/RISCV/fp-min-max-abs.ll (+83-83)
  • (modified) llvm/test/Analysis/CostModel/RISCV/fp-sqrt-pow.ll (+36-36)
  • (modified) llvm/test/Analysis/CostModel/RISCV/fp-trig-log-exp.ll (+126-126)
  • (modified) llvm/test/Analysis/CostModel/RISCV/fround.ll (+304-304)
  • (modified) llvm/test/Analysis/CostModel/RISCV/int-bit-manip.ll (+284-284)
  • (modified) llvm/test/Analysis/CostModel/RISCV/int-min-max.ll (+152-152)
  • (modified) llvm/test/Analysis/CostModel/RISCV/int-sat-math.ll (+180-180)
  • (modified) llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll (+108)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 395baa5f1aab99..2c8b71c01c2e30 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1111,7 +1111,8 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
   case Intrinsic::vp_fsub:
   case Intrinsic::vp_fmul:
   case Intrinsic::vp_fdiv:
-  case Intrinsic::vp_frem: {
+  case Intrinsic::vp_frem:
+  case Intrinsic::vp_fneg: {
     std::optional<unsigned> FOp =
         VPIntrinsic::getFunctionalOpcodeForVP(ICA.getID());
     assert(FOp.has_value());
diff --git a/llvm/test/Analysis/CostModel/RISCV/fp-min-max-abs.ll b/llvm/test/Analysis/CostModel/RISCV/fp-min-max-abs.ll
index 6e4061a42bf9b8..0b2c8da4438da2 100644
--- a/llvm/test/Analysis/CostModel/RISCV/fp-min-max-abs.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/fp-min-max-abs.ll
@@ -30,20 +30,20 @@ define void @fabs() {
   call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)
   call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)
   call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)
-  call <vscale x 1 x float> @llvm.fabs.nvx1f32(<vscale x 1 x float> undef)
-  call <vscale x 2 x float> @llvm.fabs.nvx2f32(<vscale x 2 x float> undef)
-  call <vscale x 4 x float> @llvm.fabs.nvx4f32(<vscale x 4 x float> undef)
-  call <vscale x 8 x float> @llvm.fabs.nvx8f32(<vscale x 8 x float> undef)
-  call <vscale x 16 x float> @llvm.fabs.nvx16f32(<vscale x 16 x float> undef)
+  call <vscale x 1 x float> @llvm.fabs.nxv1f32(<vscale x 1 x float> undef)
+  call <vscale x 2 x float> @llvm.fabs.nxv2f32(<vscale x 2 x float> undef)
+  call <vscale x 4 x float> @llvm.fabs.nxv4f32(<vscale x 4 x float> undef)
+  call <vscale x 8 x float> @llvm.fabs.nxv8f32(<vscale x 8 x float> undef)
+  call <vscale x 16 x float> @llvm.fabs.nxv16f32(<vscale x 16 x float> undef)
   call double @llvm.fabs.f64(double undef)
   call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)
   call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)
   call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)
   call <16 x double> @llvm.fabs.v16f64(<16 x double> undef)
-  call <vscale x 1 x double> @llvm.fabs.nvx1f64(<vscale x 1 x double> undef)
-  call <vscale x 2 x double> @llvm.fabs.nvx2f64(<vscale x 2 x double> undef)
-  call <vscale x 4 x double> @llvm.fabs.nvx4f64(<vscale x 4 x double> undef)
-  call <vscale x 8 x double> @llvm.fabs.nvx8f64(<vscale x 8 x double> undef)
+  call <vscale x 1 x double> @llvm.fabs.nxv1f64(<vscale x 1 x double> undef)
+  call <vscale x 2 x double> @llvm.fabs.nxv2f64(<vscale x 2 x double> undef)
+  call <vscale x 4 x double> @llvm.fabs.nxv4f64(<vscale x 4 x double> undef)
+  call <vscale x 8 x double> @llvm.fabs.nxv8f64(<vscale x 8 x double> undef)
   ret void
 }
 
@@ -65,10 +65,10 @@ define void @fabs_f16() {
   call <4 x half> @llvm.fabs.v4f16(<4 x half> undef)
   call <8 x half> @llvm.fabs.v8f16(<8 x half> undef)
   call <16 x half> @llvm.fabs.v16f16(<16 x half> undef)
-  call <vscale x 2 x half> @llvm.fabs.nvx2f16(<vscale x 2 x half> undef)
-  call <vscale x 4 x half> @llvm.fabs.nvx4f16(<vscale x 4 x half> undef)
-  call <vscale x 8 x half> @llvm.fabs.nvx8f16(<vscale x 8 x half> undef)
-  call <vscale x 16 x half> @llvm.fabs.nvx16f16(<vscale x 16 x half> undef)
+  call <vscale x 2 x half> @llvm.fabs.nxv2f16(<vscale x 2 x half> undef)
+  call <vscale x 4 x half> @llvm.fabs.nxv4f16(<vscale x 4 x half> undef)
+  call <vscale x 8 x half> @llvm.fabs.nxv8f16(<vscale x 8 x half> undef)
+  call <vscale x 16 x half> @llvm.fabs.nxv16f16(<vscale x 16 x half> undef)
   ret void
 }
 
@@ -100,20 +100,20 @@ define void @minnum() {
   call <4 x float> @llvm.minnum.v4f32(<4 x float> undef, <4 x float> undef)
   call <8 x float> @llvm.minnum.v8f32(<8 x float> undef, <8 x float> undef)
   call <16 x float> @llvm.minnum.v16f32(<16 x float> undef, <16 x float> undef)
-  call <vscale x 1 x float> @llvm.minnum.nvx1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
-  call <vscale x 2 x float> @llvm.minnum.nvx2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
-  call <vscale x 4 x float> @llvm.minnum.nvx4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
-  call <vscale x 8 x float> @llvm.minnum.nvx8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
-  call <vscale x 16 x float> @llvm.minnum.nvx16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
+  call <vscale x 1 x float> @llvm.minnum.nxv1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
+  call <vscale x 2 x float> @llvm.minnum.nxv2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
+  call <vscale x 4 x float> @llvm.minnum.nxv4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
+  call <vscale x 8 x float> @llvm.minnum.nxv8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
+  call <vscale x 16 x float> @llvm.minnum.nxv16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
   call double @llvm.minnum.f64(double undef, double undef)
   call <2 x double> @llvm.minnum.v2f64(<2 x double> undef, <2 x double> undef)
   call <4 x double> @llvm.minnum.v4f64(<4 x double> undef, <4 x double> undef)
   call <8 x double> @llvm.minnum.v8f64(<8 x double> undef, <8 x double> undef)
   call <16 x double> @llvm.minnum.v16f64(<16 x double> undef, <16 x double> undef)
-  call <vscale x 1 x double> @llvm.minnum.nvx1f64(<vscale x 1 x double> undef, <vscale x 1 x double> undef)
-  call <vscale x 2 x double> @llvm.minnum.nvx2f64(<vscale x 2 x double> undef, <vscale x 2 x double> undef)
-  call <vscale x 4 x double> @llvm.minnum.nvx4f64(<vscale x 4 x double> undef, <vscale x 4 x double> undef)
-  call <vscale x 8 x double> @llvm.minnum.nvx8f64(<vscale x 8 x double> undef, <vscale x 8 x double> undef)
+  call <vscale x 1 x double> @llvm.minnum.nxv1f64(<vscale x 1 x double> undef, <vscale x 1 x double> undef)
+  call <vscale x 2 x double> @llvm.minnum.nxv2f64(<vscale x 2 x double> undef, <vscale x 2 x double> undef)
+  call <vscale x 4 x double> @llvm.minnum.nxv4f64(<vscale x 4 x double> undef, <vscale x 4 x double> undef)
+  call <vscale x 8 x double> @llvm.minnum.nxv8f64(<vscale x 8 x double> undef, <vscale x 8 x double> undef)
   ret void
 }
 
@@ -149,11 +149,11 @@ define void @minnum_f16() {
   call <4 x half> @llvm.minnum.v4f16(<4 x half> undef, <4 x half> undef)
   call <8 x half> @llvm.minnum.v8f16(<8 x half> undef, <8 x half> undef)
   call <16 x half> @llvm.minnum.v16f16(<16 x half> undef, <16 x half> undef)
-  call <vscale x 1 x half> @llvm.minnum.nvx1f16(<vscale x 1 x half> undef, <vscale x 1 x half> undef)
-  call <vscale x 2 x half> @llvm.minnum.nvx2f16(<vscale x 2 x half> undef, <vscale x 2 x half> undef)
-  call <vscale x 4 x half> @llvm.minnum.nvx4f16(<vscale x 4 x half> undef, <vscale x 4 x half> undef)
-  call <vscale x 8 x half> @llvm.minnum.nvx8f16(<vscale x 8 x half> undef, <vscale x 8 x half> undef)
-  call <vscale x 16 x half> @llvm.minnum.nvx16f16(<vscale x 16 x half> undef, <vscale x 16 x half> undef)
+  call <vscale x 1 x half> @llvm.minnum.nxv1f16(<vscale x 1 x half> undef, <vscale x 1 x half> undef)
+  call <vscale x 2 x half> @llvm.minnum.nxv2f16(<vscale x 2 x half> undef, <vscale x 2 x half> undef)
+  call <vscale x 4 x half> @llvm.minnum.nxv4f16(<vscale x 4 x half> undef, <vscale x 4 x half> undef)
+  call <vscale x 8 x half> @llvm.minnum.nxv8f16(<vscale x 8 x half> undef, <vscale x 8 x half> undef)
+  call <vscale x 16 x half> @llvm.minnum.nxv16f16(<vscale x 16 x half> undef, <vscale x 16 x half> undef)
   ret void
 }
 
@@ -185,20 +185,20 @@ define void @maxnum() {
   call <4 x float> @llvm.maxnum.v4f32(<4 x float> undef, <4 x float> undef)
   call <8 x float> @llvm.maxnum.v8f32(<8 x float> undef, <8 x float> undef)
   call <16 x float> @llvm.maxnum.v16f32(<16 x float> undef, <16 x float> undef)
-  call <vscale x 1 x float> @llvm.maxnum.nvx1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
-  call <vscale x 2 x float> @llvm.maxnum.nvx2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
-  call <vscale x 4 x float> @llvm.maxnum.nvx4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
-  call <vscale x 8 x float> @llvm.maxnum.nvx8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
-  call <vscale x 16 x float> @llvm.maxnum.nvx16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
+  call <vscale x 1 x float> @llvm.maxnum.nxv1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
+  call <vscale x 2 x float> @llvm.maxnum.nxv2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
+  call <vscale x 4 x float> @llvm.maxnum.nxv4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
+  call <vscale x 8 x float> @llvm.maxnum.nxv8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
+  call <vscale x 16 x float> @llvm.maxnum.nxv16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
   call double @llvm.maxnum.f64(double undef, double undef)
   call <2 x double> @llvm.maxnum.v2f64(<2 x double> undef, <2 x double> undef)
   call <4 x double> @llvm.maxnum.v4f64(<4 x double> undef, <4 x double> undef)
   call <8 x double> @llvm.maxnum.v8f64(<8 x double> undef, <8 x double> undef)
   call <16 x double> @llvm.maxnum.v16f64(<16 x double> undef, <16 x double> undef)
-  call <vscale x 1 x double> @llvm.maxnum.nvx1f64(<vscale x 1 x double> undef, <vscale x 1 x double> undef)
-  call <vscale x 2 x double> @llvm.maxnum.nvx2f64(<vscale x 2 x double> undef, <vscale x 2 x double> undef)
-  call <vscale x 4 x double> @llvm.maxnum.nvx4f64(<vscale x 4 x double> undef, <vscale x 4 x double> undef)
-  call <vscale x 8 x double> @llvm.maxnum.nvx8f64(<vscale x 8 x double> undef, <vscale x 8 x double> undef)
+  call <vscale x 1 x double> @llvm.maxnum.nxv1f64(<vscale x 1 x double> undef, <vscale x 1 x double> undef)
+  call <vscale x 2 x double> @llvm.maxnum.nxv2f64(<vscale x 2 x double> undef, <vscale x 2 x double> undef)
+  call <vscale x 4 x double> @llvm.maxnum.nxv4f64(<vscale x 4 x double> undef, <vscale x 4 x double> undef)
+  call <vscale x 8 x double> @llvm.maxnum.nxv8f64(<vscale x 8 x double> undef, <vscale x 8 x double> undef)
   ret void
 }
 
@@ -234,11 +234,11 @@ define void @maxnum_f16() {
   call <4 x half> @llvm.maxnum.v4f16(<4 x half> undef, <4 x half> undef)
   call <8 x half> @llvm.maxnum.v8f16(<8 x half> undef, <8 x half> undef)
   call <16 x half> @llvm.maxnum.v16f16(<16 x half> undef, <16 x half> undef)
-  call <vscale x 1 x half> @llvm.maxnum.nvx1f16(<vscale x 1 x half> undef, <vscale x 1 x half> undef)
-  call <vscale x 2 x half> @llvm.maxnum.nvx2f16(<vscale x 2 x half> undef, <vscale x 2 x half> undef)
-  call <vscale x 4 x half> @llvm.maxnum.nvx4f16(<vscale x 4 x half> undef, <vscale x 4 x half> undef)
-  call <vscale x 8 x half> @llvm.maxnum.nvx8f16(<vscale x 8 x half> undef, <vscale x 8 x half> undef)
-  call <vscale x 16 x half> @llvm.maxnum.nvx16f16(<vscale x 16 x half> undef, <vscale x 16 x half> undef)
+  call <vscale x 1 x half> @llvm.maxnum.nxv1f16(<vscale x 1 x half> undef, <vscale x 1 x half> undef)
+  call <vscale x 2 x half> @llvm.maxnum.nxv2f16(<vscale x 2 x half> undef, <vscale x 2 x half> undef)
+  call <vscale x 4 x half> @llvm.maxnum.nxv4f16(<vscale x 4 x half> undef, <vscale x 4 x half> undef)
+  call <vscale x 8 x half> @llvm.maxnum.nxv8f16(<vscale x 8 x half> undef, <vscale x 8 x half> undef)
+  call <vscale x 16 x half> @llvm.maxnum.nxv16f16(<vscale x 16 x half> undef, <vscale x 16 x half> undef)
   ret void
 }
 
@@ -270,20 +270,20 @@ define void @minimum() {
   call <4 x float> @llvm.minimum.v4f32(<4 x float> undef, <4 x float> undef)
   call <8 x float> @llvm.minimum.v8f32(<8 x float> undef, <8 x float> undef)
   call <16 x float> @llvm.minimum.v16f32(<16 x float> undef, <16 x float> undef)
-  call <vscale x 1 x float> @llvm.minimum.nvx1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
-  call <vscale x 2 x float> @llvm.minimum.nvx2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
-  call <vscale x 4 x float> @llvm.minimum.nvx4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
-  call <vscale x 8 x float> @llvm.minimum.nvx8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
-  call <vscale x 16 x float> @llvm.minimum.nvx16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
+  call <vscale x 1 x float> @llvm.minimum.nxv1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
+  call <vscale x 2 x float> @llvm.minimum.nxv2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
+  call <vscale x 4 x float> @llvm.minimum.nxv4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
+  call <vscale x 8 x float> @llvm.minimum.nxv8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
+  call <vscale x 16 x float> @llvm.minimum.nxv16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
   call double @llvm.minimum.f64(double undef, double undef)
   call <2 x double> @llvm.minimum.v2f64(<2 x double> undef, <2 x double> undef)
   call <4 x double> @llvm.minimum.v4f64(<4 x double> undef, <4 x double> undef)
   call <8 x double> @llvm.minimum.v8f64(<8 x double> undef, <8 x double> undef)
   call <16 x double> @llvm.minimum.v16f64(<16 x double> undef, <16 x double> undef)
-  call <vscale x 1 x double> @llvm.minimum.nvx1f64(<vscale x 1 x double> undef, <vscale x 1 x double> undef)
-  call <vscale x 2 x double> @llvm.minimum.nvx2f64(<vscale x 2 x double> undef, <vscale x 2 x double> undef)
-  call <vscale x 4 x double> @llvm.minimum.nvx4f64(<vscale x 4 x double> undef, <vscale x 4 x double> undef)
-  call <vscale x 8 x double> @llvm.minimum.nvx8f64(<vscale x 8 x double> undef, <vscale x 8 x double> undef)
+  call <vscale x 1 x double> @llvm.minimum.nxv1f64(<vscale x 1 x double> undef, <vscale x 1 x double> undef)
+  call <vscale x 2 x double> @llvm.minimum.nxv2f64(<vscale x 2 x double> undef, <vscale x 2 x double> undef)
+  call <vscale x 4 x double> @llvm.minimum.nxv4f64(<vscale x 4 x double> undef, <vscale x 4 x double> undef)
+  call <vscale x 8 x double> @llvm.minimum.nxv8f64(<vscale x 8 x double> undef, <vscale x 8 x double> undef)
   ret void
 }
 
@@ -319,11 +319,11 @@ define void @minimum_f16() {
   call <4 x half> @llvm.minimum.v4f16(<4 x half> undef, <4 x half> undef)
   call <8 x half> @llvm.minimum.v8f16(<8 x half> undef, <8 x half> undef)
   call <16 x half> @llvm.minimum.v16f16(<16 x half> undef, <16 x half> undef)
-  call <vscale x 1 x half> @llvm.minimum.nvx1f16(<vscale x 1 x half> undef, <vscale x 1 x half> undef)
-  call <vscale x 2 x half> @llvm.minimum.nvx2f16(<vscale x 2 x half> undef, <vscale x 2 x half> undef)
-  call <vscale x 4 x half> @llvm.minimum.nvx4f16(<vscale x 4 x half> undef, <vscale x 4 x half> undef)
-  call <vscale x 8 x half> @llvm.minimum.nvx8f16(<vscale x 8 x half> undef, <vscale x 8 x half> undef)
-  call <vscale x 16 x half> @llvm.minimum.nvx16f16(<vscale x 16 x half> undef, <vscale x 16 x half> undef)
+  call <vscale x 1 x half> @llvm.minimum.nxv1f16(<vscale x 1 x half> undef, <vscale x 1 x half> undef)
+  call <vscale x 2 x half> @llvm.minimum.nxv2f16(<vscale x 2 x half> undef, <vscale x 2 x half> undef)
+  call <vscale x 4 x half> @llvm.minimum.nxv4f16(<vscale x 4 x half> undef, <vscale x 4 x half> undef)
+  call <vscale x 8 x half> @llvm.minimum.nxv8f16(<vscale x 8 x half> undef, <vscale x 8 x half> undef)
+  call <vscale x 16 x half> @llvm.minimum.nxv16f16(<vscale x 16 x half> undef, <vscale x 16 x half> undef)
   ret void
 }
 
@@ -355,20 +355,20 @@ define void @maximum() {
   call <4 x float> @llvm.maximum.v4f32(<4 x float> undef, <4 x float> undef)
   call <8 x float> @llvm.maximum.v8f32(<8 x float> undef, <8 x float> undef)
   call <16 x float> @llvm.maximum.v16f32(<16 x float> undef, <16 x float> undef)
-  call <vscale x 1 x float> @llvm.maximum.nvx1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
-  call <vscale x 2 x float> @llvm.maximum.nvx2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
-  call <vscale x 4 x float> @llvm.maximum.nvx4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
-  call <vscale x 8 x float> @llvm.maximum.nvx8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
-  call <vscale x 16 x float> @llvm.maximum.nvx16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
+  call <vscale x 1 x float> @llvm.maximum.nxv1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
+  call <vscale x 2 x float> @llvm.maximum.nxv2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
+  call <vscale x 4 x float> @llvm.maximum.nxv4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
+  call <vscale x 8 x float> @llvm.maximum.nxv8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
+  call <vscale x 16 x float> @llvm.maximum.nxv16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
   call double @llvm.maximum.f64(double undef, double undef)
   call <2 x double> @llvm.maximum.v2f64(<2 x double> undef, <2 x double> undef)
   call <4 x double> @llvm.maximum.v4f64(<4 x double> undef, <4 x double> undef)
   call <8 x double> @llvm.maximum.v8f64(<8 x double> undef, <8 x double> undef)
   call <16 x double> @llvm.maximum.v16f64(<16 x double> undef, <16 x double> undef)
-  call <vscale x 1 x double> @llvm.maximum.nvx1f64(<vscale x 1 x double> undef, <vscale x 1 x double> undef)
-  call <vscale x 2 x double> @llvm.maximum.nvx2f64(<vscale x 2 x double> undef, <vscale x 2 x double> undef)
-  call <vscale x 4 x double> @llvm.maximum.nvx4f64(<vscale x 4 x double> undef, <vscale x 4 x double> undef)
-  call <vscale x 8 x double> @llvm.maximum.nvx8f64(<vscale x 8 x double> undef, <vscale x 8 x double> undef)
+  call <vscale x 1 x double> @llvm.maximum.nxv1f64(<vscale x 1 x double> undef, <vscale x 1 x double> undef)
+  call <vscale x 2 x double> @llvm.maximum.nxv2f64(<vscale x 2 x double> undef, <vscale x 2 x double> undef)
+  call <vscale x 4 x double> @llvm.maximum.nxv4f64(<vscale x 4 x double> undef, <vscale x 4 x double> undef)
+  call <vscale x 8 x double> @llvm.maximum.nxv8f64(<vscale x 8 x double> undef, <vscale x 8 x double> undef)
   ret void
 }
 
@@ -404,11 +404,11 @@ define void @maximum_f16() {
   call <4 x half> @llvm.maximum.v4f16(<4 x half> undef, <4 x half> undef)
   call <8 x half> @llvm.maximum.v8f16(<8 x half> undef, <8 x half> undef)
   call <16 x half> @llvm.maximum.v16f16(<16 x half> undef, <16 x half> undef)
-  call <vscale x 1 x half> @llvm.maximum.nvx1f16(<vscale x 1 x half> undef, <vscale x 1 x half> undef)
-  call <vscale x 2 x half> @llvm.maximum.nvx2f16(<vscale x 2 x half> undef, <vscale x 2 x half> undef)
-  call <vscale x 4 x half> @llvm.maximum.nvx4f16(<vscale x 4 x half> undef, <vscale x 4 x half> undef)
-  call <vscale x 8 x half> @llvm.maximum.nvx8f16(<vscale x 8 x half> undef, <vscale x 8 x half> undef)
-  call <vscale x 16 x half> @llvm.maximum.nvx16f16(<vscale x 16 x half> undef, <vscale x 16 x half> undef)
+  call <vscale x 1 x half> @llvm.maximum.nxv1f16(<vscale x 1 x half> undef, <vscale x 1 x half> undef)
+  call <vscale x 2 x half> @llvm.maximum.nxv2f16(<vscale x 2 x half> undef, <vscale x 2 x half> undef)
+  call <vscale x 4 x half> @llvm.maximum.nxv4f16(<vscale x 4 x half> undef, <vscale x 4 x half> undef)
+  call <vscale x 8 x half> @llvm.maximum.nxv8f16(<vscale x 8 x half> undef, <vscale x 8 x half> undef)
+  call <vscale x 16 x half> @llvm.maximum.nxv16f16(<vscale x 16 x half> undef, <vscale x 16 x half> undef)
   ret void
 }
 
@@ -440,20 +440,20 @@ define void @copysign() {
   call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)
   call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)
   call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)
-  call <vscale x 1 x float> @llvm.copysign.nvx1f32(<vscale x 1 x float> undef, <vscale x 1 x float> undef)
-  call <vscale x 2 x float> @llvm.copysign.nvx2f32(<vscale x 2 x float> undef, <vscale x 2 x float> undef)
-  call <vscale x 4 x float> @llvm.copysign.nvx4f32(<vscale x 4 x float> undef, <vscale x 4 x float> undef)
-  call <vscale x 8 x float> @llvm.copysign.nvx8f32(<vscale x 8 x float> undef, <vscale x 8 x float> undef)
-  call <vscale x 16 x float> @llvm.copysign.nvx16f32(<vscale x 16 x float> undef, <vscale x 16 x float> undef)
+  call <vscale x 1 x float> @llvm.copysign.nxv1f32(<vscale x 1 x float> undef...
[truncated]
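For context, the hunk in RISCVTargetTransformInfo.cpp is the entire code change in this PR: vp.fneg joins the VP intrinsics whose cost is derived from their functional (non-VP) opcode. A minimal sketch of that pattern follows; the cost-hook call after the assert is assumed from the surrounding code, since the diff above is truncated before that point.

  // Sketch under the assumption above, not the verbatim source: map the VP
  // intrinsic to its functional IR opcode (vp.fneg -> Instruction::FNeg)
  // and cost that opcode directly.
  std::optional<unsigned> FOp =
      VPIntrinsic::getFunctionalOpcodeForVP(ICA.getID());
  assert(FOp.has_value() && "expected a functional opcode for this VP intrinsic");
  return getArithmeticInstrCost(*FOp, ICA.getReturnType(), CostKind);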

@llvmbot
Member

llvmbot commented Oct 31, 2024

@llvm/pr-subscribers-llvm-analysis


@@ -1501,6 +1501,113 @@ define void @vp_fadd(){
}


define void @vp_fneg() {
Contributor
It would be better to merge the test cases into arith-fp.ll, which already contains cost model tests for fneg.

Contributor Author
done

Contributor
Tests are still in rvv-intrinsics.ll?

I thought moving these test cases into CostModel/RISCV/arith-fp.ll, which already contains some tests for non-vp fneg, would be better.

Contributor Author

fixed

Contributor Author

If there are no problems, can I merge it? By the way, please review #114516.
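
For reference, the shape of a vp.fneg cost-model test is sketched below. The names here are hypothetical (the actual test bodies and their CHECK lines are truncated from this page); such CostModel tests are typically driven by something like opt -passes="print<cost-model>" -disable-output with a RISC-V vector target and checked with FileCheck.

declare <vscale x 4 x float> @llvm.vp.fneg.nxv4f32(<vscale x 4 x float>, <vscale x 4 x i1>, i32)

define <vscale x 4 x float> @vp_fneg_sketch(<vscale x 4 x float> %v, <vscale x 4 x i1> %m, i32 %evl) {
  ; Masked, EVL-predicated negation; with this patch its cost is taken from
  ; the functional opcode (fneg) rather than a generic intrinsic fallback.
  %r = call <vscale x 4 x float> @llvm.vp.fneg.nxv4f32(<vscale x 4 x float> %v, <vscale x 4 x i1> %m, i32 %evl)
  ret <vscale x 4 x float> %r
}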

@LiqinWeng LiqinWeng force-pushed the add-type-base-for-fneg branch from 3c289f3 to db37270 on November 1, 2024 05:41
@LiqinWeng LiqinWeng force-pushed the add-type-base-for-fneg branch 2 times, most recently from 35369ad to 81f9ab8 on November 4, 2024 05:15
@LiqinWeng LiqinWeng force-pushed the add-type-base-for-fneg branch from 81f9ab8 to 48950b1 on November 4, 2024 05:17
Contributor

@ElvisWang123 ElvisWang123 left a comment

Overall, LGTM. But please wait for #114184, since the maintainers have concerns about the test case modifications.

@LiqinWeng LiqinWeng merged commit 9aa4f50 into llvm:main Nov 13, 2024
8 checks passed
@LiqinWeng LiqinWeng deleted the add-type-base-for-fneg branch November 14, 2024 09:29