[VP][RISCV] Introduce llvm.vp.minimum/maximum intrinsics #74840

Merged
merged 2 commits into from
Jan 23, 2024
100 changes: 100 additions & 0 deletions llvm/docs/LangRef.rst
@@ -20392,6 +20392,106 @@ Examples:
%also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison


.. _int_vp_minimum:

'``llvm.vp.minimum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

declare <16 x float> @llvm.vp.minimum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
declare <vscale x 4 x float> @llvm.vp.minimum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
declare <256 x double> @llvm.vp.minimum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point minimum of two vectors of floating-point values,
propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.minimum``' intrinsic performs floating-point minimum (:ref:`minimum <i_minimum>`)
of the first and second vector operand on each enabled lane, the result being
NaN if either operand is a NaN. -0.0 is considered to be less than +0.0 for this
intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
The operation is performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

%r = call <4 x float> @llvm.vp.minimum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

%t = call <4 x float> @llvm.minimum.v4f32(<4 x float> %a, <4 x float> %b)
%also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
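
The lane-wise behaviour described above can be sketched as a scalar reference model in C++. This is only an illustration, not part of the patch: ``ieeeMinimum`` and ``vpMinimumModel`` are hypothetical names, ``std::nullopt`` stands in for poison, and the model assumes (as for other VP intrinsics) that lanes at or beyond ``%evl`` are disabled in the same way as lanes whose mask bit is false.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <optional>
#include <vector>

// Scalar "minimum": any NaN operand yields NaN, and -0.0 is ordered below +0.0.
inline float ieeeMinimum(float a, float b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<float>::quiet_NaN();
  if (a == b)                        // also true for +0.0 vs -0.0
    return std::signbit(a) ? a : b;  // prefer the negatively signed zero
  return a < b ? a : b;
}

// Hypothetical model of llvm.vp.minimum. std::nullopt plays the role of
// poison on disabled lanes (mask bit false, or lane index >= evl).
inline std::vector<std::optional<float>>
vpMinimumModel(const std::vector<float> &a, const std::vector<float> &b,
               const std::vector<bool> &mask, std::size_t evl) {
  std::vector<std::optional<float>> r(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    if (i < evl && mask[i])
      r[i] = ieeeMinimum(a[i], b[i]);
  return r;
}
```

For instance, with all mask bits set and an explicit vector length of 3 on a 4-lane input, the model leaves the last lane as ``std::nullopt``, and a lane pairing ``-0.0`` with ``+0.0`` yields ``-0.0``.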


.. _int_vp_maximum:

'``llvm.vp.maximum.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

declare <16 x float> @llvm.vp.maximum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
declare <vscale x 4 x float> @llvm.vp.maximum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
declare <256 x double> @llvm.vp.maximum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point maximum of two vectors of floating-point values,
propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.maximum``' intrinsic performs floating-point maximum (:ref:`maximum <i_maximum>`)
of the first and second vector operand on each enabled lane, the result being
NaN if either operand is a NaN. -0.0 is considered to be less than +0.0 for this
intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
The operation is performed in the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

%r = call <4 x float> @llvm.vp.maximum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

%t = call <4 x float> @llvm.maximum.v4f32(<4 x float> %a, <4 x float> %b)
%also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
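
To make the NaN and signed-zero wording concrete, the following scalar C++ sketch contrasts a ``maximum``-style operation with a ``maxnum``-style one. It is an illustration under assumptions: ``ieeeMaximum`` and ``maxNumLike`` are hypothetical names, and ``std::fmax`` is used only as an approximation of ``maxnum``-style NaN handling (a single quiet NaN operand is ignored).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// "maximum"-style: any NaN operand yields NaN; +0.0 is ordered above -0.0.
inline double ieeeMaximum(double a, double b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN();
  if (a == b)                        // also true for +0.0 vs -0.0
    return std::signbit(a) ? b : a;  // prefer the positively signed zero
  return a > b ? a : b;
}

// "maxnum"-style stand-in (assumption): std::fmax returns the non-NaN
// operand when exactly one operand is NaN.
inline double maxNumLike(double a, double b) { return std::fmax(a, b); }
```

With one NaN operand, ``ieeeMaximum`` returns NaN while ``maxNumLike`` returns the other operand, which is the distinction between the ``vp.maximum`` and ``vp.maxnum`` families.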


.. _int_vp_fadd:

'``llvm.vp.fadd.*``' Intrinsics
10 changes: 10 additions & 0 deletions llvm/include/llvm/IR/Intrinsics.td
@@ -1991,6 +1991,16 @@ let IntrProperties = [IntrNoMem, IntrNoSync, IntrWillReturn] in {
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_minimum : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_maximum : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
llvm_i32_ty]>;
def int_vp_copysign : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
[ LLVMMatchType<0>,
LLVMMatchType<0>,
18 changes: 16 additions & 2 deletions llvm/include/llvm/IR/VPIntrinsics.def
@@ -367,20 +367,34 @@ VP_PROPERTY_FUNCTIONAL_SDOPC(FCOPYSIGN)
VP_PROPERTY_FUNCTIONAL_INTRINSIC(copysign)
END_REGISTER_VP(vp_copysign, VP_FCOPYSIGN)

- // llvm.vp.minnum(x, y, mask,vlen)
+ // llvm.vp.minnum(x,y,mask,vlen)
BEGIN_REGISTER_VP(vp_minnum, 2, 3, VP_FMINNUM, -1)
VP_PROPERTY_BINARYOP
VP_PROPERTY_FUNCTIONAL_SDOPC(FMINNUM)
VP_PROPERTY_FUNCTIONAL_INTRINSIC(minnum)
END_REGISTER_VP(vp_minnum, VP_FMINNUM)

- // llvm.vp.maxnum(x, y, mask,vlen)
+ // llvm.vp.maxnum(x,y,mask,vlen)
BEGIN_REGISTER_VP(vp_maxnum, 2, 3, VP_FMAXNUM, -1)
VP_PROPERTY_BINARYOP
VP_PROPERTY_FUNCTIONAL_SDOPC(FMAXNUM)
VP_PROPERTY_FUNCTIONAL_INTRINSIC(maxnum)
END_REGISTER_VP(vp_maxnum, VP_FMAXNUM)

// llvm.vp.minimum(x,y,mask,vlen)
BEGIN_REGISTER_VP(vp_minimum, 2, 3, VP_FMINIMUM, -1)
VP_PROPERTY_BINARYOP
VP_PROPERTY_FUNCTIONAL_SDOPC(FMINIMUM)
VP_PROPERTY_FUNCTIONAL_INTRINSIC(minimum)
END_REGISTER_VP(vp_minimum, VP_FMINIMUM)

// llvm.vp.maximum(x,y,mask,vlen)
BEGIN_REGISTER_VP(vp_maximum, 2, 3, VP_FMAXIMUM, -1)
VP_PROPERTY_BINARYOP
VP_PROPERTY_FUNCTIONAL_SDOPC(FMAXIMUM)
VP_PROPERTY_FUNCTIONAL_INTRINSIC(maximum)
END_REGISTER_VP(vp_maximum, VP_FMAXIMUM)

// llvm.vp.ceil(x,mask,vlen)
BEGIN_REGISTER_VP(vp_ceil, 1, 2, VP_FCEIL, -1)
VP_PROPERTY_FUNCTIONAL_INTRINSIC(ceil)
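
A ``.def`` file like this one is typically consumed with the X-macro pattern: the includer defines the registration macros, then includes the file so each entry expands into whatever code is needed. The self-contained C++ sketch below imitates that idea with a hypothetical in-file list (``TOY_VP_LIST``, ``ToyVPInfo``, and ``toyVPTable`` are made-up names, not LLVM APIs). It assumes the second and third arguments of ``BEGIN_REGISTER_VP`` are the mask and vector-length operand positions, which matches the ``(x,y,mask,vlen)`` comments above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a .def file: intrinsic name, mask operand
// position, explicit-vector-length operand position.
#define TOY_VP_LIST(X) \
  X(vp_minnum, 2, 3)   \
  X(vp_maxnum, 2, 3)   \
  X(vp_minimum, 2, 3)  \
  X(vp_maximum, 2, 3)  \
  X(vp_ceil, 1, 2)

struct ToyVPInfo {
  std::string name;
  int maskPos;
  int evlPos;
};

// Expand every list entry into one table row.
inline std::vector<ToyVPInfo> toyVPTable() {
  std::vector<ToyVPInfo> table;
#define TOY_ENTRY(NAME, MASKPOS, EVLPOS) \
  table.push_back({#NAME, MASKPOS, EVLPOS});
  TOY_VP_LIST(TOY_ENTRY)
#undef TOY_ENTRY
  return table;
}
```

Adding an intrinsic then only requires one new list entry, which is why the patch needs just two short registration blocks here.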
2 changes: 2 additions & 0 deletions llvm/lib/CodeGen/ExpandVectorPredication.cpp
@@ -729,6 +729,8 @@ Value *CachingVPExpander::expandPredication(VPIntrinsic &VPI) {
case Intrinsic::vp_sqrt:
case Intrinsic::vp_maxnum:
case Intrinsic::vp_minnum:
case Intrinsic::vp_maximum:
case Intrinsic::vp_minimum:
return expandPredicationToFPCall(Builder, VPI,
VPI.getFunctionalIntrinsicID().value());
case Intrinsic::vp_load:
4 changes: 4 additions & 0 deletions llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -1143,7 +1143,9 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
case ISD::FMINNUM: case ISD::VP_FMINNUM:
case ISD::FMAXNUM: case ISD::VP_FMAXNUM:
case ISD::FMINIMUM:
case ISD::VP_FMINIMUM:
case ISD::FMAXIMUM:
case ISD::VP_FMAXIMUM:
case ISD::SDIV: case ISD::VP_SDIV:
case ISD::UDIV: case ISD::VP_UDIV:
case ISD::FDIV: case ISD::VP_FDIV:
@@ -4131,7 +4133,9 @@ void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {
case ISD::FMINNUM: case ISD::VP_FMINNUM:
case ISD::FMAXNUM: case ISD::VP_FMAXNUM:
case ISD::FMINIMUM:
case ISD::VP_FMINIMUM:
case ISD::FMAXIMUM:
case ISD::VP_FMAXIMUM:
case ISD::SMIN: case ISD::VP_SMIN:
case ISD::SMAX: case ISD::VP_SMAX:
case ISD::UMIN: case ISD::VP_UMIN:
28 changes: 24 additions & 4 deletions llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -687,7 +687,8 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::VP_FCEIL, ISD::VP_FFLOOR, ISD::VP_FROUND,
ISD::VP_FROUNDEVEN, ISD::VP_FCOPYSIGN, ISD::VP_FROUNDTOZERO,
ISD::VP_FRINT, ISD::VP_FNEARBYINT, ISD::VP_IS_FPCLASS,
- ISD::EXPERIMENTAL_VP_REVERSE, ISD::EXPERIMENTAL_VP_SPLICE};
+ ISD::VP_FMINIMUM, ISD::VP_FMAXIMUM, ISD::EXPERIMENTAL_VP_REVERSE,
+ ISD::EXPERIMENTAL_VP_SPLICE};

static const unsigned IntegerVecReduceOps[] = {
ISD::VECREDUCE_ADD, ISD::VECREDUCE_AND, ISD::VECREDUCE_OR,
@@ -927,7 +928,8 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
ISD::VP_FMINNUM, ISD::VP_FMAXNUM, ISD::VP_FCEIL,
ISD::VP_FFLOOR, ISD::VP_FROUND, ISD::VP_FROUNDEVEN,
ISD::VP_FCOPYSIGN, ISD::VP_FROUNDTOZERO, ISD::VP_FRINT,
- ISD::VP_FNEARBYINT, ISD::VP_SETCC};
+ ISD::VP_FNEARBYINT, ISD::VP_SETCC, ISD::VP_FMINIMUM,
+ ISD::VP_FMAXIMUM};

// Sets common operation actions on RVV floating-point vector types.
const auto SetCommonVFPActions = [&](MVT VT) {
@@ -5401,7 +5403,16 @@ static SDValue lowerFMAXIMUM_FMINIMUM(SDValue Op, SelectionDAG &DAG,
Y = convertToScalableVector(ContainerVT, Y, DAG, Subtarget);
}

- auto [Mask, VL] = getDefaultVLOps(VT, ContainerVT, DL, DAG, Subtarget);
+ SDValue Mask, VL;
+ if (Op->isVPOpcode()) {
+ Mask = Op.getOperand(2);
+ if (VT.isFixedLengthVector())
+ Mask = convertToScalableVector(getMaskTypeFor(ContainerVT), Mask, DAG,
+ Subtarget);
+ VL = Op.getOperand(3);
+ } else {
+ std::tie(Mask, VL) = getDefaultVLOps(VT, ContainerVT, DL, DAG, Subtarget);
+ }

SDValue NewY = Y;
if (!XIsNeverNan) {
@@ -5422,7 +5433,9 @@ static SDValue lowerFMAXIMUM_FMINIMUM(SDValue Op, SelectionDAG &DAG,
}

unsigned Opc =
- Op.getOpcode() == ISD::FMAXIMUM ? RISCVISD::VFMAX_VL : RISCVISD::VFMIN_VL;
+ Op.getOpcode() == ISD::FMAXIMUM || Op->getOpcode() == ISD::VP_FMAXIMUM
+ ? RISCVISD::VFMAX_VL
+ : RISCVISD::VFMIN_VL;
SDValue Res = DAG.getNode(Opc, DL, ContainerVT, NewX, NewY,
DAG.getUNDEF(ContainerVT), Mask, VL);
if (VT.isFixedLengthVector())
@@ -6651,6 +6664,13 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
!Subtarget.hasVInstructionsF16()))
return SplitVPOp(Op, DAG);
return lowerVectorFTRUNC_FCEIL_FFLOOR_FROUND(Op, DAG, Subtarget);
case ISD::VP_FMAXIMUM:
case ISD::VP_FMINIMUM:
if (Op.getValueType() == MVT::nxv32f16 &&
(Subtarget.hasVInstructionsF16Minimal() &&
!Subtarget.hasVInstructionsF16()))
return SplitVPOp(Op, DAG);
return lowerFMAXIMUM_FMINIMUM(Op, DAG, Subtarget);
case ISD::EXPERIMENTAL_VP_SPLICE:
return lowerVPSpliceExperimental(Op, DAG);
case ISD::EXPERIMENTAL_VP_REVERSE:
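
The ``lowerFMAXIMUM_FMINIMUM`` hunks above show ``NewX``/``NewY`` being fed to ``VFMIN_VL``/``VFMAX_VL``, but the body of the ``if (!XIsNeverNan)`` block is elided in this view. One plausible reading, sketched below in scalar C++ purely as an assumption, is that each operand is replaced by the other when the other is NaN, so that a min which ignores a single NaN still produces NaN. ``nanIgnoringMin`` and ``minimumViaSelects`` are hypothetical names, and the first is only assumed to mirror how the RVV ``vfmin`` instruction treats NaNs and signed zeros.

```cpp
#include <cassert>
#include <cmath>

// Assumed model of a hardware min: a single NaN operand is ignored, and
// -0.0 is ordered below +0.0.
inline float nanIgnoringMin(float a, float b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  if (a == b) return std::signbit(a) ? a : b;
  return a < b ? a : b;
}

// NaN-propagating minimum built from selects plus the NaN-ignoring min.
inline float minimumViaSelects(float x, float y) {
  float newY = std::isnan(x) ? x : y;  // x is NaN: second input becomes x
  float newX = std::isnan(y) ? y : x;  // y is NaN: first input becomes y
  return nanIgnoringMin(newX, newY);   // sees NaN on both sides if either was NaN
}
```

In the predicated case the same selects and the final min would all be governed by the mask and vector length taken from operands 2 and 3 of the VP node, as the first hunk of this function shows.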