Skip to content

Intrinsic: introduce minimumnum and maximumnum #96299

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

wzssyqa
Copy link
Contributor

@wzssyqa wzssyqa commented Jun 21, 2024

Currently, on different platform, the behaivor of llvm.minnum is different if one operand is sNaN:

When we compare sNaN vs NUM:

ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN. RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86: Returns NUM but not same with IEEE754-2019's minimumNumber as
+0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.

So, let's introduce llvm.minmumnum/llvm.maximumnum, which always follow IEEE754-2019's minimumNumber/maximumNumber.

Half-fix: #93033

Currently, on different platform, the behaivor of llvm.minnum is
different if one operand is sNaN:

When we compare sNaN vs NUM:

ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN.
RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86:
Returns NUM but not same with IEEE754-2019's minimumNumber as
     +0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.

So, let's introduce llvm.minmumnum/llvm.maximumnum, which always follow
IEEE754-2019's minimumNumber/maximumNumber.

Half-fix: llvm#93033
@llvmbot
Copy link
Member

llvmbot commented Jun 21, 2024

@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-backend-loongarch
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-llvm-adt

@llvm/pr-subscribers-backend-aarch64

Author: YunQiang Su (wzssyqa)

Changes

Currently, on different platform, the behaivor of llvm.minnum is different if one operand is sNaN:

When we compare sNaN vs NUM:

ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN. RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86: Returns NUM but not same with IEEE754-2019's minimumNumber as
+0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.

So, let's introduce llvm.minmumnum/llvm.maximumnum, which always follow IEEE754-2019's minimumNumber/maximumNumber.

Half-fix: #93033


Patch is 163.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96299.diff

65 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+550)
  • (modified) llvm/include/llvm/ADT/APFloat.h (+26)
  • (modified) llvm/include/llvm/Analysis/TargetLibraryInfo.def (+33)
  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+10)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h (+2)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h (+12)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h (+1)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h (+24)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/Utils.h (+4)
  • (modified) llvm/include/llvm/CodeGen/ISDOpcodes.h (+11)
  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+7)
  • (modified) llvm/include/llvm/IR/ConstrainedOps.def (+2)
  • (modified) llvm/include/llvm/IR/IRBuilder.h (+12)
  • (modified) llvm/include/llvm/IR/IntrinsicInst.h (+2)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+40)
  • (modified) llvm/include/llvm/IR/RuntimeLibcalls.def (+10)
  • (modified) llvm/include/llvm/IR/VPIntrinsics.def (+22)
  • (modified) llvm/include/llvm/Support/TargetOpcodes.def (+6)
  • (modified) llvm/include/llvm/Target/GenericOpcodes.td (+19)
  • (modified) llvm/include/llvm/Target/GlobalISel/Combine.td (+3-3)
  • (modified) llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td (+4)
  • (modified) llvm/include/llvm/Target/TargetSelectionDAG.td (+18)
  • (modified) llvm/lib/CodeGen/ExpandVectorPredication.cpp (+17-1)
  • (modified) llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp (+2)
  • (modified) llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (+2)
  • (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+8)
  • (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+51-2)
  • (modified) llvm/lib/CodeGen/GlobalISel/Utils.cpp (+9-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+9-3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+32)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp (+61-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h (+4)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (+10)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (+14)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+13-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+30)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+8)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+101-5)
  • (modified) llvm/lib/CodeGen/TargetLoweringBase.cpp (+1)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+12)
  • (modified) llvm/lib/Target/Hexagon/HexagonISelLowering.cpp (+5)
  • (modified) llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp (+6)
  • (modified) llvm/lib/Target/Hexagon/HexagonPatterns.td (+4)
  • (modified) llvm/lib/Target/Hexagon/HexagonPatternsHVX.td (+8)
  • (modified) llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp (+3-1)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+25-7)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoD.td (+2)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoF.td (+2)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVSDPatterns.td (+2)
  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+4)
  • (modified) llvm/lib/Target/X86/X86TargetTransformInfo.cpp (+10-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir (+12)
  • (modified) llvm/test/CodeGen/AArch64/combine_andor_with_cmps.ll (+4-7)
  • (added) llvm/test/CodeGen/AArch64/fp-maximumnum-minimumnum.ll (+51)
  • (modified) llvm/test/CodeGen/Hexagon/fminmax-v67.ll (+40)
  • (added) llvm/test/CodeGen/LoongArch/fp-maximumnum-minimumnum.ll (+154)
  • (added) llvm/test/CodeGen/Mips/fp-maximumnum-minimumnum.ll (+100)
  • (modified) llvm/test/CodeGen/RISCV/double-intrinsics.ll (+128-50)
  • (modified) llvm/test/CodeGen/RISCV/float-intrinsics.ll (+136-48)
  • (modified) llvm/test/TableGen/GlobalISelEmitter.td (+1-1)
  • (modified) llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml (+10-4)
  • (modified) llvm/unittests/ADT/APFloatTest.cpp (+48)
  • (modified) llvm/unittests/Analysis/TargetLibraryInfoTest.cpp (+7-1)
  • (modified) llvm/unittests/IR/VPIntrinsicTest.cpp (+6-5)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 9a0f73e4d6d86..f902e5604e6ae 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -16080,6 +16080,100 @@ The returned value is completely identical to the input except for the sign bit;
 in particular, if the input is a NaN, then the quiet/signaling bit and payload
 are perfectly preserved.
 
+.. _i_fminmax_family:
+
+'``llvm.min.*``' Intrinsics Comparation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Standard:
+"""""""""
+
+IEEE754 and ISO C define some min/max operations, and they have some differences
+on working with qNaN/sNaN and +0.0/-0.0. Here is the list:
+
+.. list-table::
+   :header-rows: 2
+
+   * - ``ISO C``
+     - fmin/fmax
+     - fmininum/fmaximum
+     - fminimum_num/fmaximum_num
+
+   * - ``IEEE754``
+     - minNum/maxNum (2008)
+     - minimum/maximum (2019)
+     - minimumNumber/maximumNumber (2019)
+
+   * - ``+0.0 vs -0.0``
+     - either one
+     - +0.0 > -0.0
+     - +0.0 > -0.0
+
+   * - ``NUM vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - NUM, invalid exception
+
+   * - ``qNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``NUM vs qNaN``
+     - NUM, no exception
+     - qNaN, no exception
+     - NUM, no exception
+
+LLVM Implementation:
+""""""""""""""""""""
+
+LLVM implements all ISO C flavors as listed in this table.
+Only basic intrinsics list here. The constrained version
+ones may have different behaivor on exception.
+
+Since some architectures implement minNum/maxNum with +0.0>-0.0,
+so we define internal ISD::MINNUM_IEEE and ISD::MAXNUM_IEEE.
+They will be helpful to implement minimumnum/maximumnum.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 16 28 28 28
+
+   * - Operation
+     - minnum/maxnum
+     - minimum/maximum
+     - minimumnum/maximumnum
+
+   * - ``NUM vs qNaN``
+     - NUM, no exception
+     - qNaN, no exception
+     - NUM, no exception
+
+   * - ``NUM vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - NUM, invalid exception
+
+   * - ``qNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``sNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``+0.0 vs -0.0``
+     - either one
+     - +0.0(max)/-0.0(min)
+     - +0.0(max)/-0.0(min)
+
+   * - ``NUM vs NUM``
+     - larger(max)/smaller(min)
+     - larger(max)/smaller(min)
+     - larger(max)/smaller(min)
+
 .. _i_minnum:
 
 '``llvm.minnum.*``' Intrinsic
@@ -16261,6 +16355,98 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this
 intrinsic. Note that these are the semantics specified in the draft of
 IEEE 754-2019.
 
+.. _i_minimumnum:
+
+'``llvm.minimumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.minimumnum`` on any
+floating-point or vector of floating-point type. Not all targets support
+all types however.
+
+::
+
+      declare float     @llvm.minimumnum.f32(float %Val0, float %Val1)
+      declare double    @llvm.minimumnum.f64(double %Val0, double %Val1)
+      declare x86_fp80  @llvm.minimumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
+      declare fp128     @llvm.minimumnum.f128(fp128 %Val0, fp128 %Val1)
+      declare ppc_fp128 @llvm.minimumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
+
+Overview:
+"""""""""
+
+The '``llvm.minimumnum.*``' intrinsics return the minimum of the two
+arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+
+Arguments:
+""""""""""
+
+The arguments and return value are floating-point numbers of the same
+type.
+
+Semantics:
+""""""""""
+If both operands are NaNs (including sNaN), returns qNaN. If one operand
+is NaN (including sNaN) and another operand is a number, return the number.
+Otherwise returns the lesser of the two arguments. -0.0 is considered to
+be less than +0.0 for this intrinsic.
+
+Note that these are the semantics of minimumNumber specified in IEEE 754-2019.
+
+It has some different with '``llvm.minnum.*``':
+1)'``llvm.minnum.*``' will return qNaN if either operand is sNaN.
+2)'``llvm.minnum*``' may return either one if we compare +0.0 vs -0.0.
+
+.. _i_maximumnum:
+
+'``llvm.maximumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.maximumnum`` on any
+floating-point or vector of floating-point type. Not all targets support
+all types however.
+
+::
+
+      declare float     @llvm.maximumnum.f32(float %Val0, float %Val1)
+      declare double    @llvm.maximumnum.f64(double %Val0, double %Val1)
+      declare x86_fp80  @llvm.maximumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
+      declare fp128     @llvm.maximumnum.f128(fp128 %Val0, fp128 %Val1)
+      declare ppc_fp128 @llvm.maximumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
+
+Overview:
+"""""""""
+
+The '``llvm.maximumnum.*``' intrinsics return the maximum of the two
+arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+
+Arguments:
+""""""""""
+
+The arguments and return value are floating-point numbers of the same
+type.
+
+Semantics:
+""""""""""
+If both operands are NaNs (including sNaN), returns qNaN. If one operand
+is NaN (including sNaN) and another operand is a number, return the number.
+Otherwise returns the greater of the two arguments. -0.0 is considered to
+be less than +0.0 for this intrinsic.
+
+Note that these are the semantics of maximumNumber specified in IEEE 754-2019.
+
+It has some different with '``llvm.maxnum.*``':
+1)'``llvm.maxnum.*``' will return qNaN if either operand is sNaN.
+2)'``llvm.maxnum*``' may return either one if we compare +0.0 vs -0.0.
+
 .. _int_copysign:
 
 '``llvm.copysign.*``' Intrinsic
@@ -19067,6 +19253,65 @@ Arguments:
 """"""""""
 The argument to this intrinsic must be a vector of floating-point values.
 
+.. _int_vector_reduce_fmaximumnum:
+
+'``llvm.vector.reduce.fmaximumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %a)
+      declare double @llvm.vector.reduce.fmaximumnum.v2f64(<2 x double> %a)
+
+Overview:
+"""""""""
+
+The '``llvm.vector.reduce.fmaximumnum.*``' intrinsics do a floating-point
+``MAX`` reduction of a vector, returning the result as a scalar. The return type
+matches the element-type of the vector input.
+
+This instruction has the same comparison semantics as the '``llvm.maximumnum.*``'
+intrinsic. That is, this intrinsic not propagates NaNs (even sNaNs) and +0.0 is
+considered greater than -0.0. If all elements of the vector are NaNs, the result is NaN.
+
+Arguments:
+""""""""""
+The argument to this intrinsic must be a vector of floating-point values.
+
+.. _int_vector_reduce_fminimumnum:
+
+'``llvm.vector.reduce.fminimumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vector.reduce.fminimumnum.v4f32(<4 x float> %a)
+      declare double @llvm.vector.reduce.fminimumnum.v2f64(<2 x double> %a)
+
+Overview:
+"""""""""
+
+The '``llvm.vector.reduce.fminimumnum.*``' intrinsics do a floating-point
+``MIN`` reduction of a vector, returning the result as a scalar. The return type
+matches the element-type of the vector input.
+
+This instruction has the same comparison semantics as the '``llvm.minimumnum.*``'
+intrinsic. That is, this intrinsic not propagates NaNs (even sNaNs) and -0.0 is
+considered less than +0.0. If all elements of the vector are NaNs, the result is NaN.
+
+Arguments:
+""""""""""
+The argument to this intrinsic must be a vector of floating-point values.
+
+
 '``llvm.vector.insert``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -21253,6 +21498,107 @@ Examples:
       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
 
 
+.. _int_vp_minimumnum:
+
+'``llvm.vp.minimumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare <16 x float>  @llvm.vp.minimumnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
+      declare <vscale x 4 x float>  @llvm.vp.minimumnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
+      declare <256 x double>  @llvm.vp.minimumnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point minimum of two vectors of floating-point values,
+not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two operands and the result have the same vector of floating-point type. The
+third operand is the vector mask and has the same number of elements as the
+result vector type. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.minimumnum``' intrinsic performs floating-point minimumNumber (:ref:`minimumnum <i_minimumnum>`)
+of the first and second vector operand on each enabled lane, the result being
+NaN if both operands are NaN. Even sNaN vs Num results NUM. -0.0 is considered
+to be less than +0.0 for this intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
+The operation is performed in the default floating-point environment.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call <4 x float> @llvm.vp.minimumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
+
+      %t = call <4 x float> @llvm.minimumnum.v4f32(<4 x float> %a, <4 x float> %b)
+      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
+
+
+.. _int_vp_maximumnum:
+
+'``llvm.vp.maximumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare <16 x float>  @llvm.vp.maximumnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
+      declare <vscale x 4 x float>  @llvm.vp.maximumnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
+      declare <256 x double>  @llvm.vp.maximumnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point maximum of two vectors of floating-point values,
+not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two operands and the result have the same vector of floating-point type. The
+third operand is the vector mask and has the same number of elements as the
+result vector type. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.maximumnum``' intrinsic performs floating-point maximumNumber (:ref:`maximumnum <i_maximumnum>`)
+of the first and second vector operand on each enabled lane, the result being
+NaN if both operands are NaN. sNaN vs Num results NUM. -0.0 is considered
+to be less than +0.0 for this intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
+The operation is performed in the default floating-point environment.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call <4 x float> @llvm.vp.maximumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
+
+      %t = call <4 x float> @llvm.maximumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
+
+
+
 .. _int_vp_fadd:
 
 '``llvm.vp.fadd.*``' Intrinsics
@@ -22652,6 +22998,146 @@ Examples:
       %also.r = call float @llvm.minimum.f32(float %reduction, float %start)
 
 
+.. _int_vp_reduce_fmaximumnum:
+
+'``llvm.vp.reduce.fmaximumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vp.reduce.fmaximumnum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, float <vector_length>)
+      declare double @llvm.vp.reduce.fmaximumnum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
+value, returning the result as a scalar.
+
+
+Arguments:
+""""""""""
+
+The first operand is the start value of the reduction, which must be a scalar
+floating-point type equal to the result type. The second operand is the vector
+on which the reduction is performed and must be a vector of floating-point
+values whose element type is the result/start type. The third operand is the
+vector mask and is a vector of boolean values with the same number of elements
+as the vector operand. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.reduce.fmaximumnum``' intrinsic performs the floating-point ``MAX``
+reduction (:ref:`llvm.vector.reduce.fmaximumnum <int_vector_reduce_fmaximumnum>`) of
+the vector operand ``val`` on each enabled lane, taking the maximum of that and
+the scalar ``start_value``. Disabled lanes are treated as containing the
+neutral value (i.e. having no effect on the reduction operation). If the vector
+length is zero, the result is the start value.
+
+The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
+flags are set or only the ``nnan`` is set, the neutral value is ``-Infinity``.
+If ``ninf`` is set, then the neutral value is the smallest floating-point value
+for the result type.
+
+This instruction has the same comparison semantics as the
+:ref:`llvm.vector.reduce.fmaximumnum <int_vector_reduce_fmaximumnum>` intrinsic (and
+thus the '``llvm.maximumnum.*``' intrinsic). That is, the result will always be a
+number unless all of the elements in the vector and the starting value is
+``NaN``. Namely, this intrinsic not propagates ``NaN``, even for ``sNaN``.
+Also, -0.0 is considered less than +0.0.
+
+To ignore the start value, the neutral value can be used.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call float @llvm.vp.reduce.fmaximumnum.v4f32(float %float, <4 x float> %a, <4 x i1> %mask, i32 %evl)
+      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
+      ; are treated as though %mask were false for those lanes.
+
+      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
+      %reduction = call float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %masked.a)
+      %also.r = call float @llvm.maximumnum.f32(float %reduction, float %start)
+
+
+.. _int_vp_reduce_fminimumnum:
+
+'``llvm.vp.reduce.fminimumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vp.reduce.fminimumnum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, float <vector_length>)
+      declare double @llvm.vp.reduce.fminimumnum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
+value, returning the result as a scalar.
+
+
+Arguments:
+""""""""""
+
+The first operand is the start value of the reduction, which must be a scalar
+floating-point type equal to the result type. The second operand is the vector
+on which the reduction is performed and must be a vector of floating-point
+values whose element type is the result/start type. The third operand is the
+vector mask and is a vector of boolean values with the same number of elements
+as the vector operand. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.reduce.fminimumnum``' intrinsic performs the floating-point ``MIN``
+reduction (:ref:`llvm.vector.reduce.fminimumnum <int_vector_reduce_fminimumnum>`) of
+the vector operand ``val`` on each enabled lane, taking the minimum of that and
+the scalar ``start_value``. Disabled lanes are treated as containing the
+neutral value (i.e. having no effect on the reduction operation). If the vector
+length is zero, the result is the start value.
+
+The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
+flags are set or only the ``nnan`` is set, the neutral value is ``+Infinity``.
+If ``ninf`` is set, then the neutral value is the largest floating-point value
+for the result type.
+
+This instruction has the same comparison semantics as the
+:ref:`llvm.vector.reduce.fminimumnum <int_vector_reduce_fminimumnum>` intrinsic (and
+thus the '``llvm.minimumnum.*``' intrinsic). That is, the result will always be a
+number unless all of the elements in the vector and the starting value is
+``NaN``. Namely, this intrinsic not propagates ``NaN``, even for ``sNaN``.
+Also, -0.0 is considered less than +0.0.
+
+To ignore the start value, the neutral value can be used.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call float @llvm.vp.reduce.fmaximumnum.v4f32(float %float, <4 x float> %a, <4 x i1> %mask, i32 %evl)
+      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
+      ; are treated as though %mask were false for those lanes.
+
+      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
+      %reduction = call float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %masked.a)
+      %also.r = call float @llvm.maximumnum.f32(float %reduction, float %start)
+
+
 .. _int_get_active_lane_mask:
 
 '``llvm.get.active.lane.mask.*``' Intrinsics
@@ -26957,6 +27443,70 @@ Semantics:
 This function follows semantics specified in the draft of IEEE 754-2019.
 
 
+'``llvm.experimental.constrained.maximumnum``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.maximumnum(<type> <op1>, <type> <op2>
+                                                metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.maximumnum``' intrinsic returns the maximum
+of the two arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two arguments and the return value are floating-point numbers
+of the same type.
+
+The third argument specifies the exception behavior as described above.
+
+Semantics:
+""""""""""
+
+This function follows semantics specified in IEEE 754-2019.
+
+
+'``llvm.experimental.constrained.minimumnum``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.minimumnum(<type> <op1>, <type> <op2>
+                                                metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.minimumnum``' intrinsic returns the minimum
+of the two arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two arguments and the return value are floating-point numbers
+of the same type.
+
+The third argument specifies the exception behavior as described above.
+
+Semantics:
+""""""""""
+
+This function follows semantics specified in IEEE 754-2019.
+
+
 '``llvm.experimental.constrained.ceil``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index c24eae8da3797..db2fa480655c6 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -1483,6 +1483,19 @@ inline APFloat minimum(const APFloat &A, const APFloat &B) {
   return B < A ? B : A;
 }
 
+/// Implements IEEE 754-2019 minim...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Jun 21, 2024

@llvm/pr-subscribers-llvm-analysis

Author: YunQiang Su (wzssyqa)

Changes

Currently, on different platform, the behaivor of llvm.minnum is different if one operand is sNaN:

When we compare sNaN vs NUM:

ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN. RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86: Returns NUM but not same with IEEE754-2019's minimumNumber as
+0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.

So, let's introduce llvm.minmumnum/llvm.maximumnum, which always follow IEEE754-2019's minimumNumber/maximumNumber.

Half-fix: #93033


Patch is 163.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96299.diff

65 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+550)
  • (modified) llvm/include/llvm/ADT/APFloat.h (+26)
  • (modified) llvm/include/llvm/Analysis/TargetLibraryInfo.def (+33)
  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+10)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h (+2)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h (+12)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h (+1)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h (+24)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/Utils.h (+4)
  • (modified) llvm/include/llvm/CodeGen/ISDOpcodes.h (+11)
  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+7)
  • (modified) llvm/include/llvm/IR/ConstrainedOps.def (+2)
  • (modified) llvm/include/llvm/IR/IRBuilder.h (+12)
  • (modified) llvm/include/llvm/IR/IntrinsicInst.h (+2)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+40)
  • (modified) llvm/include/llvm/IR/RuntimeLibcalls.def (+10)
  • (modified) llvm/include/llvm/IR/VPIntrinsics.def (+22)
  • (modified) llvm/include/llvm/Support/TargetOpcodes.def (+6)
  • (modified) llvm/include/llvm/Target/GenericOpcodes.td (+19)
  • (modified) llvm/include/llvm/Target/GlobalISel/Combine.td (+3-3)
  • (modified) llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td (+4)
  • (modified) llvm/include/llvm/Target/TargetSelectionDAG.td (+18)
  • (modified) llvm/lib/CodeGen/ExpandVectorPredication.cpp (+17-1)
  • (modified) llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp (+2)
  • (modified) llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (+2)
  • (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+8)
  • (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+51-2)
  • (modified) llvm/lib/CodeGen/GlobalISel/Utils.cpp (+9-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+9-3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+32)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp (+61-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h (+4)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (+10)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (+14)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+13-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+30)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+8)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+101-5)
  • (modified) llvm/lib/CodeGen/TargetLoweringBase.cpp (+1)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+12)
  • (modified) llvm/lib/Target/Hexagon/HexagonISelLowering.cpp (+5)
  • (modified) llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp (+6)
  • (modified) llvm/lib/Target/Hexagon/HexagonPatterns.td (+4)
  • (modified) llvm/lib/Target/Hexagon/HexagonPatternsHVX.td (+8)
  • (modified) llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp (+3-1)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+25-7)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoD.td (+2)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoF.td (+2)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVSDPatterns.td (+2)
  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+4)
  • (modified) llvm/lib/Target/X86/X86TargetTransformInfo.cpp (+10-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir (+12)
  • (modified) llvm/test/CodeGen/AArch64/combine_andor_with_cmps.ll (+4-7)
  • (added) llvm/test/CodeGen/AArch64/fp-maximumnum-minimumnum.ll (+51)
  • (modified) llvm/test/CodeGen/Hexagon/fminmax-v67.ll (+40)
  • (added) llvm/test/CodeGen/LoongArch/fp-maximumnum-minimumnum.ll (+154)
  • (added) llvm/test/CodeGen/Mips/fp-maximumnum-minimumnum.ll (+100)
  • (modified) llvm/test/CodeGen/RISCV/double-intrinsics.ll (+128-50)
  • (modified) llvm/test/CodeGen/RISCV/float-intrinsics.ll (+136-48)
  • (modified) llvm/test/TableGen/GlobalISelEmitter.td (+1-1)
  • (modified) llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml (+10-4)
  • (modified) llvm/unittests/ADT/APFloatTest.cpp (+48)
  • (modified) llvm/unittests/Analysis/TargetLibraryInfoTest.cpp (+7-1)
  • (modified) llvm/unittests/IR/VPIntrinsicTest.cpp (+6-5)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 9a0f73e4d6d86..f902e5604e6ae 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -16080,6 +16080,100 @@ The returned value is completely identical to the input except for the sign bit;
 in particular, if the input is a NaN, then the quiet/signaling bit and payload
 are perfectly preserved.
 
+.. _i_fminmax_family:
+
+'``llvm.min.*``' Intrinsics Comparation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Standard:
+"""""""""
+
+IEEE754 and ISO C define some min/max operations, and they have some differences
+on working with qNaN/sNaN and +0.0/-0.0. Here is the list:
+
+.. list-table::
+   :header-rows: 2
+
+   * - ``ISO C``
+     - fmin/fmax
+     - fmininum/fmaximum
+     - fminimum_num/fmaximum_num
+
+   * - ``IEEE754``
+     - minNum/maxNum (2008)
+     - minimum/maximum (2019)
+     - minimumNumber/maximumNumber (2019)
+
+   * - ``+0.0 vs -0.0``
+     - either one
+     - +0.0 > -0.0
+     - +0.0 > -0.0
+
+   * - ``NUM vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - NUM, invalid exception
+
+   * - ``qNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``NUM vs qNaN``
+     - NUM, no exception
+     - qNaN, no exception
+     - NUM, no exception
+
+LLVM Implementation:
+""""""""""""""""""""
+
+LLVM implements all ISO C flavors as listed in this table.
+Only basic intrinsics list here. The constrained version
+ones may have different behaivor on exception.
+
+Since some architectures implement minNum/maxNum with +0.0>-0.0,
+so we define internal ISD::MINNUM_IEEE and ISD::MAXNUM_IEEE.
+They will be helpful to implement minimumnum/maximumnum.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 16 28 28 28
+
+   * - Operation
+     - minnum/maxnum
+     - minimum/maximum
+     - minimumnum/maximumnum
+
+   * - ``NUM vs qNaN``
+     - NUM, no exception
+     - qNaN, no exception
+     - NUM, no exception
+
+   * - ``NUM vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - NUM, invalid exception
+
+   * - ``qNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``sNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``+0.0 vs -0.0``
+     - either one
+     - +0.0(max)/-0.0(min)
+     - +0.0(max)/-0.0(min)
+
+   * - ``NUM vs NUM``
+     - larger(max)/smaller(min)
+     - larger(max)/smaller(min)
+     - larger(max)/smaller(min)
+
 .. _i_minnum:
 
 '``llvm.minnum.*``' Intrinsic
@@ -16261,6 +16355,98 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this
 intrinsic. Note that these are the semantics specified in the draft of
 IEEE 754-2019.
 
+.. _i_minimumnum:
+
+'``llvm.minimumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.minimumnum`` on any
+floating-point or vector of floating-point type. Not all targets support
+all types however.
+
+::
+
+      declare float     @llvm.minimumnum.f32(float %Val0, float %Val1)
+      declare double    @llvm.minimumnum.f64(double %Val0, double %Val1)
+      declare x86_fp80  @llvm.minimumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
+      declare fp128     @llvm.minimumnum.f128(fp128 %Val0, fp128 %Val1)
+      declare ppc_fp128 @llvm.minimumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
+
+Overview:
+"""""""""
+
+The '``llvm.minimumnum.*``' intrinsics return the minimum of the two
+arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+
+Arguments:
+""""""""""
+
+The arguments and return value are floating-point numbers of the same
+type.
+
+Semantics:
+""""""""""
+If both operands are NaNs (including sNaN), returns qNaN. If one operand
+is NaN (including sNaN) and another operand is a number, return the number.
+Otherwise returns the lesser of the two arguments. -0.0 is considered to
+be less than +0.0 for this intrinsic.
+
+Note that these are the semantics of minimumNumber specified in IEEE 754-2019.
+
+It has some different with '``llvm.minnum.*``':
+1)'``llvm.minnum.*``' will return qNaN if either operand is sNaN.
+2)'``llvm.minnum*``' may return either one if we compare +0.0 vs -0.0.
+
+.. _i_maximumnum:
+
+'``llvm.maximumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.maximumnum`` on any
+floating-point or vector of floating-point type. Not all targets support
+all types however.
+
+::
+
+      declare float     @llvm.maximumnum.f32(float %Val0, float %Val1)
+      declare double    @llvm.maximumnum.f64(double %Val0, double %Val1)
+      declare x86_fp80  @llvm.maximumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
+      declare fp128     @llvm.maximumnum.f128(fp128 %Val0, fp128 %Val1)
+      declare ppc_fp128 @llvm.maximumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
+
+Overview:
+"""""""""
+
+The '``llvm.maximumnum.*``' intrinsics return the maximum of the two
+arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+
+Arguments:
+""""""""""
+
+The arguments and return value are floating-point numbers of the same
+type.
+
+Semantics:
+""""""""""
+If both operands are NaNs (including sNaN), returns qNaN. If one operand
+is NaN (including sNaN) and another operand is a number, return the number.
+Otherwise returns the greater of the two arguments. -0.0 is considered to
+be less than +0.0 for this intrinsic.
+
+Note that these are the semantics of maximumNumber specified in IEEE 754-2019.
+
+It has some different with '``llvm.maxnum.*``':
+1)'``llvm.maxnum.*``' will return qNaN if either operand is sNaN.
+2)'``llvm.maxnum*``' may return either one if we compare +0.0 vs -0.0.
+
 .. _int_copysign:
 
 '``llvm.copysign.*``' Intrinsic
@@ -19067,6 +19253,65 @@ Arguments:
 """"""""""
 The argument to this intrinsic must be a vector of floating-point values.
 
+.. _int_vector_reduce_fmaximumnum:
+
+'``llvm.vector.reduce.fmaximumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %a)
+      declare double @llvm.vector.reduce.fmaximumnum.v2f64(<2 x double> %a)
+
+Overview:
+"""""""""
+
+The '``llvm.vector.reduce.fmaximumnum.*``' intrinsics do a floating-point
+``MAX`` reduction of a vector, returning the result as a scalar. The return type
+matches the element-type of the vector input.
+
+This instruction has the same comparison semantics as the '``llvm.maximumnum.*``'
+intrinsic. That is, this intrinsic not propagates NaNs (even sNaNs) and +0.0 is
+considered greater than -0.0. If all elements of the vector are NaNs, the result is NaN.
+
+Arguments:
+""""""""""
+The argument to this intrinsic must be a vector of floating-point values.
+
+.. _int_vector_reduce_fminimumnum:
+
+'``llvm.vector.reduce.fminimumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vector.reduce.fminimumnum.v4f32(<4 x float> %a)
+      declare double @llvm.vector.reduce.fminimumnum.v2f64(<2 x double> %a)
+
+Overview:
+"""""""""
+
+The '``llvm.vector.reduce.fminimumnum.*``' intrinsics do a floating-point
+``MIN`` reduction of a vector, returning the result as a scalar. The return type
+matches the element-type of the vector input.
+
+This instruction has the same comparison semantics as the '``llvm.minimumnum.*``'
+intrinsic. That is, this intrinsic not propagates NaNs (even sNaNs) and -0.0 is
+considered less than +0.0. If all elements of the vector are NaNs, the result is NaN.
+
+Arguments:
+""""""""""
+The argument to this intrinsic must be a vector of floating-point values.
+
+
 '``llvm.vector.insert``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -21253,6 +21498,107 @@ Examples:
       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
 
 
+.. _int_vp_minimumnum:
+
+'``llvm.vp.minimumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare <16 x float>  @llvm.vp.minimumnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
+      declare <vscale x 4 x float>  @llvm.vp.minimumnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
+      declare <256 x double>  @llvm.vp.minimumnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point minimum of two vectors of floating-point values,
+not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two operands and the result have the same vector of floating-point type. The
+third operand is the vector mask and has the same number of elements as the
+result vector type. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.minimumnum``' intrinsic performs floating-point minimumNumber (:ref:`minimumnum <i_minimumnum>`)
+of the first and second vector operand on each enabled lane, the result being
+NaN if both operands are NaN. Even sNaN vs Num results NUM. -0.0 is considered
+to be less than +0.0 for this intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
+The operation is performed in the default floating-point environment.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call <4 x float> @llvm.vp.minimumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
+
+      %t = call <4 x float> @llvm.minimumnum.v4f32(<4 x float> %a, <4 x float> %b)
+      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
+
+
+.. _int_vp_maximumnum:
+
+'``llvm.vp.maximumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare <16 x float>  @llvm.vp.maximumnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
+      declare <vscale x 4 x float>  @llvm.vp.maximumnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
+      declare <256 x double>  @llvm.vp.maximumnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point maximum of two vectors of floating-point values,
+not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two operands and the result have the same vector of floating-point type. The
+third operand is the vector mask and has the same number of elements as the
+result vector type. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.maximumnum``' intrinsic performs floating-point maximumNumber (:ref:`maximumnum <i_maximumnum>`)
+of the first and second vector operand on each enabled lane, the result being
+NaN if both operands are NaN. sNaN vs Num results NUM. -0.0 is considered
+to be less than +0.0 for this intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
+The operation is performed in the default floating-point environment.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call <4 x float> @llvm.vp.maximumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
+
+      %t = call <4 x float> @llvm.maximumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
+
+
+
 .. _int_vp_fadd:
 
 '``llvm.vp.fadd.*``' Intrinsics
@@ -22652,6 +22998,146 @@ Examples:
       %also.r = call float @llvm.minimum.f32(float %reduction, float %start)
 
 
+.. _int_vp_reduce_fmaximumnum:
+
+'``llvm.vp.reduce.fmaximumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vp.reduce.fmaximumnum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, float <vector_length>)
+      declare double @llvm.vp.reduce.fmaximumnum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
+value, returning the result as a scalar.
+
+
+Arguments:
+""""""""""
+
+The first operand is the start value of the reduction, which must be a scalar
+floating-point type equal to the result type. The second operand is the vector
+on which the reduction is performed and must be a vector of floating-point
+values whose element type is the result/start type. The third operand is the
+vector mask and is a vector of boolean values with the same number of elements
+as the vector operand. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.reduce.fmaximumnum``' intrinsic performs the floating-point ``MAX``
+reduction (:ref:`llvm.vector.reduce.fmaximumnum <int_vector_reduce_fmaximumnum>`) of
+the vector operand ``val`` on each enabled lane, taking the maximum of that and
+the scalar ``start_value``. Disabled lanes are treated as containing the
+neutral value (i.e. having no effect on the reduction operation). If the vector
+length is zero, the result is the start value.
+
+The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
+flags are set or only the ``nnan`` is set, the neutral value is ``-Infinity``.
+If ``ninf`` is set, then the neutral value is the smallest floating-point value
+for the result type.
+
+This instruction has the same comparison semantics as the
+:ref:`llvm.vector.reduce.fmaximumnum <int_vector_reduce_fmaximumnum>` intrinsic (and
+thus the '``llvm.maximumnum.*``' intrinsic). That is, the result will always be a
+number unless all of the elements in the vector and the starting value is
+``NaN``. Namely, this intrinsic not propagates ``NaN``, even for ``sNaN``.
+Also, -0.0 is considered less than +0.0.
+
+To ignore the start value, the neutral value can be used.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call float @llvm.vp.reduce.fmaximumnum.v4f32(float %float, <4 x float> %a, <4 x i1> %mask, i32 %evl)
+      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
+      ; are treated as though %mask were false for those lanes.
+
+      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
+      %reduction = call float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %masked.a)
+      %also.r = call float @llvm.maximumnum.f32(float %reduction, float %start)
+
+
+.. _int_vp_reduce_fminimumnum:
+
+'``llvm.vp.reduce.fminimumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vp.reduce.fminimumnum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, float <vector_length>)
+      declare double @llvm.vp.reduce.fminimumnum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
+value, returning the result as a scalar.
+
+
+Arguments:
+""""""""""
+
+The first operand is the start value of the reduction, which must be a scalar
+floating-point type equal to the result type. The second operand is the vector
+on which the reduction is performed and must be a vector of floating-point
+values whose element type is the result/start type. The third operand is the
+vector mask and is a vector of boolean values with the same number of elements
+as the vector operand. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.reduce.fminimumnum``' intrinsic performs the floating-point ``MIN``
+reduction (:ref:`llvm.vector.reduce.fminimumnum <int_vector_reduce_fminimumnum>`) of
+the vector operand ``val`` on each enabled lane, taking the minimum of that and
+the scalar ``start_value``. Disabled lanes are treated as containing the
+neutral value (i.e. having no effect on the reduction operation). If the vector
+length is zero, the result is the start value.
+
+The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
+flags are set or only the ``nnan`` is set, the neutral value is ``+Infinity``.
+If ``ninf`` is set, then the neutral value is the largest floating-point value
+for the result type.
+
+This instruction has the same comparison semantics as the
+:ref:`llvm.vector.reduce.fminimumnum <int_vector_reduce_fminimumnum>` intrinsic (and
+thus the '``llvm.minimumnum.*``' intrinsic). That is, the result will always be a
+number unless all of the elements in the vector and the starting value is
+``NaN``. Namely, this intrinsic not propagates ``NaN``, even for ``sNaN``.
+Also, -0.0 is considered less than +0.0.
+
+To ignore the start value, the neutral value can be used.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call float @llvm.vp.reduce.fmaximumnum.v4f32(float %float, <4 x float> %a, <4 x i1> %mask, i32 %evl)
+      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
+      ; are treated as though %mask were false for those lanes.
+
+      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
+      %reduction = call float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %masked.a)
+      %also.r = call float @llvm.maximumnum.f32(float %reduction, float %start)
+
+
 .. _int_get_active_lane_mask:
 
 '``llvm.get.active.lane.mask.*``' Intrinsics
@@ -26957,6 +27443,70 @@ Semantics:
 This function follows semantics specified in the draft of IEEE 754-2019.
 
 
+'``llvm.experimental.constrained.maximumnum``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.maximumnum(<type> <op1>, <type> <op2>
+                                                metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.maximumnum``' intrinsic returns the maximum
+of the two arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two arguments and the return value are floating-point numbers
+of the same type.
+
+The third argument specifies the exception behavior as described above.
+
+Semantics:
+""""""""""
+
+This function follows semantics specified in IEEE 754-2019.
+
+
+'``llvm.experimental.constrained.minimumnum``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.minimumnum(<type> <op1>, <type> <op2>
+                                                metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.minimumnum``' intrinsic returns the minimum
+of the two arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two arguments and the return value are floating-point numbers
+of the same type.
+
+The third argument specifies the exception behavior as described above.
+
+Semantics:
+""""""""""
+
+This function follows semantics specified in IEEE 754-2019.
+
+
 '``llvm.experimental.constrained.ceil``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index c24eae8da3797..db2fa480655c6 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -1483,6 +1483,19 @@ inline APFloat minimum(const APFloat &A, const APFloat &B) {
   return B < A ? B : A;
 }
 
+/// Implements IEEE 754-2019 minim...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Jun 21, 2024

@llvm/pr-subscribers-backend-risc-v

Author: YunQiang Su (wzssyqa)

Changes

Currently, on different platform, the behaivor of llvm.minnum is different if one operand is sNaN:

When we compare sNaN vs NUM:

ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN. RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86: Returns NUM but not same with IEEE754-2019's minimumNumber as
+0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.

So, let's introduce llvm.minmumnum/llvm.maximumnum, which always follow IEEE754-2019's minimumNumber/maximumNumber.

Half-fix: #93033


Patch is 163.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96299.diff

65 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+550)
  • (modified) llvm/include/llvm/ADT/APFloat.h (+26)
  • (modified) llvm/include/llvm/Analysis/TargetLibraryInfo.def (+33)
  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+10)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h (+2)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h (+12)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h (+1)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h (+24)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/Utils.h (+4)
  • (modified) llvm/include/llvm/CodeGen/ISDOpcodes.h (+11)
  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+7)
  • (modified) llvm/include/llvm/IR/ConstrainedOps.def (+2)
  • (modified) llvm/include/llvm/IR/IRBuilder.h (+12)
  • (modified) llvm/include/llvm/IR/IntrinsicInst.h (+2)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+40)
  • (modified) llvm/include/llvm/IR/RuntimeLibcalls.def (+10)
  • (modified) llvm/include/llvm/IR/VPIntrinsics.def (+22)
  • (modified) llvm/include/llvm/Support/TargetOpcodes.def (+6)
  • (modified) llvm/include/llvm/Target/GenericOpcodes.td (+19)
  • (modified) llvm/include/llvm/Target/GlobalISel/Combine.td (+3-3)
  • (modified) llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td (+4)
  • (modified) llvm/include/llvm/Target/TargetSelectionDAG.td (+18)
  • (modified) llvm/lib/CodeGen/ExpandVectorPredication.cpp (+17-1)
  • (modified) llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp (+2)
  • (modified) llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (+2)
  • (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+8)
  • (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+51-2)
  • (modified) llvm/lib/CodeGen/GlobalISel/Utils.cpp (+9-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+9-3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+32)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp (+61-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h (+4)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (+10)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (+14)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+13-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+30)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+8)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+101-5)
  • (modified) llvm/lib/CodeGen/TargetLoweringBase.cpp (+1)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+12)
  • (modified) llvm/lib/Target/Hexagon/HexagonISelLowering.cpp (+5)
  • (modified) llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp (+6)
  • (modified) llvm/lib/Target/Hexagon/HexagonPatterns.td (+4)
  • (modified) llvm/lib/Target/Hexagon/HexagonPatternsHVX.td (+8)
  • (modified) llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp (+3-1)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+25-7)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoD.td (+2)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoF.td (+2)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVSDPatterns.td (+2)
  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+4)
  • (modified) llvm/lib/Target/X86/X86TargetTransformInfo.cpp (+10-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir (+12)
  • (modified) llvm/test/CodeGen/AArch64/combine_andor_with_cmps.ll (+4-7)
  • (added) llvm/test/CodeGen/AArch64/fp-maximumnum-minimumnum.ll (+51)
  • (modified) llvm/test/CodeGen/Hexagon/fminmax-v67.ll (+40)
  • (added) llvm/test/CodeGen/LoongArch/fp-maximumnum-minimumnum.ll (+154)
  • (added) llvm/test/CodeGen/Mips/fp-maximumnum-minimumnum.ll (+100)
  • (modified) llvm/test/CodeGen/RISCV/double-intrinsics.ll (+128-50)
  • (modified) llvm/test/CodeGen/RISCV/float-intrinsics.ll (+136-48)
  • (modified) llvm/test/TableGen/GlobalISelEmitter.td (+1-1)
  • (modified) llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml (+10-4)
  • (modified) llvm/unittests/ADT/APFloatTest.cpp (+48)
  • (modified) llvm/unittests/Analysis/TargetLibraryInfoTest.cpp (+7-1)
  • (modified) llvm/unittests/IR/VPIntrinsicTest.cpp (+6-5)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 9a0f73e4d6d86..f902e5604e6ae 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -16080,6 +16080,100 @@ The returned value is completely identical to the input except for the sign bit;
 in particular, if the input is a NaN, then the quiet/signaling bit and payload
 are perfectly preserved.
 
+.. _i_fminmax_family:
+
+'``llvm.min.*``' Intrinsics Comparation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Standard:
+"""""""""
+
+IEEE754 and ISO C define some min/max operations, and they have some differences
+on working with qNaN/sNaN and +0.0/-0.0. Here is the list:
+
+.. list-table::
+   :header-rows: 2
+
+   * - ``ISO C``
+     - fmin/fmax
+     - fmininum/fmaximum
+     - fminimum_num/fmaximum_num
+
+   * - ``IEEE754``
+     - minNum/maxNum (2008)
+     - minimum/maximum (2019)
+     - minimumNumber/maximumNumber (2019)
+
+   * - ``+0.0 vs -0.0``
+     - either one
+     - +0.0 > -0.0
+     - +0.0 > -0.0
+
+   * - ``NUM vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - NUM, invalid exception
+
+   * - ``qNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``NUM vs qNaN``
+     - NUM, no exception
+     - qNaN, no exception
+     - NUM, no exception
+
+LLVM Implementation:
+""""""""""""""""""""
+
+LLVM implements all ISO C flavors as listed in this table.
+Only basic intrinsics list here. The constrained version
+ones may have different behaivor on exception.
+
+Since some architectures implement minNum/maxNum with +0.0>-0.0,
+so we define internal ISD::MINNUM_IEEE and ISD::MAXNUM_IEEE.
+They will be helpful to implement minimumnum/maximumnum.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 16 28 28 28
+
+   * - Operation
+     - minnum/maxnum
+     - minimum/maximum
+     - minimumnum/maximumnum
+
+   * - ``NUM vs qNaN``
+     - NUM, no exception
+     - qNaN, no exception
+     - NUM, no exception
+
+   * - ``NUM vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - NUM, invalid exception
+
+   * - ``qNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``sNaN vs sNaN``
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+     - qNaN, invalid exception
+
+   * - ``+0.0 vs -0.0``
+     - either one
+     - +0.0(max)/-0.0(min)
+     - +0.0(max)/-0.0(min)
+
+   * - ``NUM vs NUM``
+     - larger(max)/smaller(min)
+     - larger(max)/smaller(min)
+     - larger(max)/smaller(min)
+
 .. _i_minnum:
 
 '``llvm.minnum.*``' Intrinsic
@@ -16261,6 +16355,98 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this
 intrinsic. Note that these are the semantics specified in the draft of
 IEEE 754-2019.
 
+.. _i_minimumnum:
+
+'``llvm.minimumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.minimumnum`` on any
+floating-point or vector of floating-point type. Not all targets support
+all types however.
+
+::
+
+      declare float     @llvm.minimumnum.f32(float %Val0, float %Val1)
+      declare double    @llvm.minimumnum.f64(double %Val0, double %Val1)
+      declare x86_fp80  @llvm.minimumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
+      declare fp128     @llvm.minimumnum.f128(fp128 %Val0, fp128 %Val1)
+      declare ppc_fp128 @llvm.minimumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
+
+Overview:
+"""""""""
+
+The '``llvm.minimumnum.*``' intrinsics return the minimum of the two
+arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+
+Arguments:
+""""""""""
+
+The arguments and return value are floating-point numbers of the same
+type.
+
+Semantics:
+""""""""""
+If both operands are NaNs (including sNaN), returns qNaN. If one operand
+is NaN (including sNaN) and another operand is a number, return the number.
+Otherwise returns the lesser of the two arguments. -0.0 is considered to
+be less than +0.0 for this intrinsic.
+
+Note that these are the semantics of minimumNumber specified in IEEE 754-2019.
+
+It has some different with '``llvm.minnum.*``':
+1)'``llvm.minnum.*``' will return qNaN if either operand is sNaN.
+2)'``llvm.minnum*``' may return either one if we compare +0.0 vs -0.0.
+
+.. _i_maximumnum:
+
+'``llvm.maximumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.maximumnum`` on any
+floating-point or vector of floating-point type. Not all targets support
+all types however.
+
+::
+
+      declare float     @llvm.maximumnum.f32(float %Val0, float %Val1)
+      declare double    @llvm.maximumnum.f64(double %Val0, double %Val1)
+      declare x86_fp80  @llvm.maximumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
+      declare fp128     @llvm.maximumnum.f128(fp128 %Val0, fp128 %Val1)
+      declare ppc_fp128 @llvm.maximumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
+
+Overview:
+"""""""""
+
+The '``llvm.maximumnum.*``' intrinsics return the maximum of the two
+arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+
+Arguments:
+""""""""""
+
+The arguments and return value are floating-point numbers of the same
+type.
+
+Semantics:
+""""""""""
+If both operands are NaNs (including sNaN), returns qNaN. If one operand
+is NaN (including sNaN) and another operand is a number, return the number.
+Otherwise returns the greater of the two arguments. -0.0 is considered to
+be less than +0.0 for this intrinsic.
+
+Note that these are the semantics of maximumNumber specified in IEEE 754-2019.
+
+It has some different with '``llvm.maxnum.*``':
+1)'``llvm.maxnum.*``' will return qNaN if either operand is sNaN.
+2)'``llvm.maxnum*``' may return either one if we compare +0.0 vs -0.0.
+
 .. _int_copysign:
 
 '``llvm.copysign.*``' Intrinsic
@@ -19067,6 +19253,65 @@ Arguments:
 """"""""""
 The argument to this intrinsic must be a vector of floating-point values.
 
+.. _int_vector_reduce_fmaximumnum:
+
+'``llvm.vector.reduce.fmaximumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %a)
+      declare double @llvm.vector.reduce.fmaximumnum.v2f64(<2 x double> %a)
+
+Overview:
+"""""""""
+
+The '``llvm.vector.reduce.fmaximumnum.*``' intrinsics do a floating-point
+``MAX`` reduction of a vector, returning the result as a scalar. The return type
+matches the element-type of the vector input.
+
+This instruction has the same comparison semantics as the '``llvm.maximumnum.*``'
+intrinsic. That is, this intrinsic not propagates NaNs (even sNaNs) and +0.0 is
+considered greater than -0.0. If all elements of the vector are NaNs, the result is NaN.
+
+Arguments:
+""""""""""
+The argument to this intrinsic must be a vector of floating-point values.
+
+.. _int_vector_reduce_fminimumnum:
+
+'``llvm.vector.reduce.fminimumnum.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vector.reduce.fminimumnum.v4f32(<4 x float> %a)
+      declare double @llvm.vector.reduce.fminimumnum.v2f64(<2 x double> %a)
+
+Overview:
+"""""""""
+
+The '``llvm.vector.reduce.fminimumnum.*``' intrinsics do a floating-point
+``MIN`` reduction of a vector, returning the result as a scalar. The return type
+matches the element-type of the vector input.
+
+This instruction has the same comparison semantics as the '``llvm.minimumnum.*``'
+intrinsic. That is, this intrinsic not propagates NaNs (even sNaNs) and -0.0 is
+considered less than +0.0. If all elements of the vector are NaNs, the result is NaN.
+
+Arguments:
+""""""""""
+The argument to this intrinsic must be a vector of floating-point values.
+
+
 '``llvm.vector.insert``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -21253,6 +21498,107 @@ Examples:
       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
 
 
+.. _int_vp_minimumnum:
+
+'``llvm.vp.minimumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare <16 x float>  @llvm.vp.minimumnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
+      declare <vscale x 4 x float>  @llvm.vp.minimumnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
+      declare <256 x double>  @llvm.vp.minimumnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point minimum of two vectors of floating-point values,
+not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two operands and the result have the same vector of floating-point type. The
+third operand is the vector mask and has the same number of elements as the
+result vector type. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.minimumnum``' intrinsic performs floating-point minimumNumber (:ref:`minimumnum <i_minimumnum>`)
+of the first and second vector operand on each enabled lane, the result being
+NaN if both operands are NaN. Even sNaN vs Num results NUM. -0.0 is considered
+to be less than +0.0 for this intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
+The operation is performed in the default floating-point environment.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call <4 x float> @llvm.vp.minimumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
+
+      %t = call <4 x float> @llvm.minimumnum.v4f32(<4 x float> %a, <4 x float> %b)
+      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
+
+
+.. _int_vp_maximumnum:
+
+'``llvm.vp.maximumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare <16 x float>  @llvm.vp.maximumnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
+      declare <vscale x 4 x float>  @llvm.vp.maximumnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
+      declare <256 x double>  @llvm.vp.maximumnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point maximum of two vectors of floating-point values,
+not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two operands and the result have the same vector of floating-point type. The
+third operand is the vector mask and has the same number of elements as the
+result vector type. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.maximumnum``' intrinsic performs floating-point maximumNumber (:ref:`maximumnum <i_maximumnum>`)
+of the first and second vector operand on each enabled lane, the result being
+NaN if both operands are NaN. sNaN vs Num results NUM. -0.0 is considered
+to be less than +0.0 for this intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
+The operation is performed in the default floating-point environment.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call <4 x float> @llvm.vp.maximumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
+
+      %t = call <4 x float> @llvm.maximumnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
+      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
+
+
+
 .. _int_vp_fadd:
 
 '``llvm.vp.fadd.*``' Intrinsics
@@ -22652,6 +22998,146 @@ Examples:
       %also.r = call float @llvm.minimum.f32(float %reduction, float %start)
 
 
+.. _int_vp_reduce_fmaximumnum:
+
+'``llvm.vp.reduce.fmaximumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vp.reduce.fmaximumnum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, float <vector_length>)
+      declare double @llvm.vp.reduce.fmaximumnum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
+value, returning the result as a scalar.
+
+
+Arguments:
+""""""""""
+
+The first operand is the start value of the reduction, which must be a scalar
+floating-point type equal to the result type. The second operand is the vector
+on which the reduction is performed and must be a vector of floating-point
+values whose element type is the result/start type. The third operand is the
+vector mask and is a vector of boolean values with the same number of elements
+as the vector operand. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.reduce.fmaximumnum``' intrinsic performs the floating-point ``MAX``
+reduction (:ref:`llvm.vector.reduce.fmaximumnum <int_vector_reduce_fmaximumnum>`) of
+the vector operand ``val`` on each enabled lane, taking the maximum of that and
+the scalar ``start_value``. Disabled lanes are treated as containing the
+neutral value (i.e. having no effect on the reduction operation). If the vector
+length is zero, the result is the start value.
+
+The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
+flags are set or only the ``nnan`` is set, the neutral value is ``-Infinity``.
+If ``ninf`` is set, then the neutral value is the smallest floating-point value
+for the result type.
+
+This instruction has the same comparison semantics as the
+:ref:`llvm.vector.reduce.fmaximumnum <int_vector_reduce_fmaximumnum>` intrinsic (and
+thus the '``llvm.maximumnum.*``' intrinsic). That is, the result will always be a
+number unless all of the elements in the vector and the starting value is
+``NaN``. Namely, this intrinsic not propagates ``NaN``, even for ``sNaN``.
+Also, -0.0 is considered less than +0.0.
+
+To ignore the start value, the neutral value can be used.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call float @llvm.vp.reduce.fmaximumnum.v4f32(float %float, <4 x float> %a, <4 x i1> %mask, i32 %evl)
+      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
+      ; are treated as though %mask were false for those lanes.
+
+      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
+      %reduction = call float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %masked.a)
+      %also.r = call float @llvm.maximumnum.f32(float %reduction, float %start)
+
+
+.. _int_vp_reduce_fminimumnum:
+
+'``llvm.vp.reduce.fminimumnum.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare float @llvm.vp.reduce.fminimumnum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, float <vector_length>)
+      declare double @llvm.vp.reduce.fminimumnum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
+
+Overview:
+"""""""""
+
+Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
+value, returning the result as a scalar.
+
+
+Arguments:
+""""""""""
+
+The first operand is the start value of the reduction, which must be a scalar
+floating-point type equal to the result type. The second operand is the vector
+on which the reduction is performed and must be a vector of floating-point
+values whose element type is the result/start type. The third operand is the
+vector mask and is a vector of boolean values with the same number of elements
+as the vector operand. The fourth operand is the explicit vector length of the
+operation.
+
+Semantics:
+""""""""""
+
+The '``llvm.vp.reduce.fminimumnum``' intrinsic performs the floating-point ``MIN``
+reduction (:ref:`llvm.vector.reduce.fminimumnum <int_vector_reduce_fminimumnum>`) of
+the vector operand ``val`` on each enabled lane, taking the minimum of that and
+the scalar ``start_value``. Disabled lanes are treated as containing the
+neutral value (i.e. having no effect on the reduction operation). If the vector
+length is zero, the result is the start value.
+
+The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
+flags are set or only the ``nnan`` is set, the neutral value is ``+Infinity``.
+If ``ninf`` is set, then the neutral value is the largest floating-point value
+for the result type.
+
+This instruction has the same comparison semantics as the
+:ref:`llvm.vector.reduce.fminimumnum <int_vector_reduce_fminimumnum>` intrinsic (and
+thus the '``llvm.minimumnum.*``' intrinsic). That is, the result will always be a
+number unless all of the elements in the vector and the starting value is
+``NaN``. Namely, this intrinsic not propagates ``NaN``, even for ``sNaN``.
+Also, -0.0 is considered less than +0.0.
+
+To ignore the start value, the neutral value can be used.
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %r = call float @llvm.vp.reduce.fmaximumnum.v4f32(float %float, <4 x float> %a, <4 x i1> %mask, i32 %evl)
+      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
+      ; are treated as though %mask were false for those lanes.
+
+      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
+      %reduction = call float @llvm.vector.reduce.fmaximumnum.v4f32(<4 x float> %masked.a)
+      %also.r = call float @llvm.maximumnum.f32(float %reduction, float %start)
+
+
 .. _int_get_active_lane_mask:
 
 '``llvm.get.active.lane.mask.*``' Intrinsics
@@ -26957,6 +27443,70 @@ Semantics:
 This function follows semantics specified in the draft of IEEE 754-2019.
 
 
+'``llvm.experimental.constrained.maximumnum``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.maximumnum(<type> <op1>, <type> <op2>
+                                                metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.maximumnum``' intrinsic returns the maximum
+of the two arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two arguments and the return value are floating-point numbers
+of the same type.
+
+The third argument specifies the exception behavior as described above.
+
+Semantics:
+""""""""""
+
+This function follows semantics specified in IEEE 754-2019.
+
+
+'``llvm.experimental.constrained.minimumnum``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.minimumnum(<type> <op1>, <type> <op2>
+                                                metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.minimumnum``' intrinsic returns the minimum
+of the two arguments, not propagating NaNs and treating -0.0 as less than +0.0.
+
+Arguments:
+""""""""""
+
+The first two arguments and the return value are floating-point numbers
+of the same type.
+
+The third argument specifies the exception behavior as described above.
+
+Semantics:
+""""""""""
+
+This function follows semantics specified in IEEE 754-2019.
+
+
 '``llvm.experimental.constrained.ceil``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index c24eae8da3797..db2fa480655c6 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -1483,6 +1483,19 @@ inline APFloat minimum(const APFloat &A, const APFloat &B) {
   return B < A ? B : A;
 }
 
+/// Implements IEEE 754-2019 minim...
[truncated]

@wzssyqa
Copy link
Contributor Author

wzssyqa commented Jun 21, 2024

@tschuett
Copy link

This is a huge patch. Please split out/stack the backend changes.

@wzssyqa
Copy link
Contributor Author

wzssyqa commented Jun 21, 2024

Some background here: #93841

@wzssyqa
Copy link
Contributor Author

wzssyqa commented Jun 21, 2024

This is a huge patch. Please split out/stack the backend changes.

It is not easy to split, as we need to make test always green.
Maybe we can split APFloat out first: #96304

@tschuett
Copy link

First introduce the intrinsics and in later patches you can update the backends.

@wzssyqa
Copy link
Contributor Author

wzssyqa commented Jun 25, 2024

IR support and SelectionDAG support have been splited as a new PR.

#96649

@wzssyqa
Copy link
Contributor Author

wzssyqa commented Jun 28, 2024

Close this one, due to that we are working on split it to several PRs.

See:
#96304
#96649
#96281

@wzssyqa wzssyqa closed this Jun 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

llvm.minnum should be lowered to fminimum_numf instead of fminf
3 participants