[IR] Add `llvm.modf` intrinsic #121948

MacDue · 2025-01-07T14:56:42Z

This adds the llvm.modf intrinsic, legalization, and lowering (mostly reusing the lowering for sincos and frexp).

The llvm.modf intrinsic takes a floating-point value and returns both the integral and fractional parts (as a struct).

declare { float, float }             @llvm.modf.f32(float  %Val)
declare { double, double }           @llvm.modf.f64(double %Val)
declare { x86_fp80, x86_fp80 }       @llvm.modf.f80(x86_fp80  %Val)
declare { fp128, fp128 }             @llvm.modf.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }     @llvm.modf.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float>  %Val)

This corresponds to the libm modf function but returns multiple values in a struct (rather than take output pointers), which makes it easier to vectorize.

llvmbot · 2025-01-07T14:57:15Z

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-aarch64

Author: Benjamin Maxwell (MacDue)

Changes

This adds the llvm.modf intrinsic, legalization, and lowering (mostly reusing the lowering for sincos and frexp).

The llvm.modf intrinsic takes a floating-point value and returns both the integral and fractional parts (as a struct).

declare { float, float }             @<!-- -->llvm.modf.f32(float  %Val)
declare { double, double }           @<!-- -->llvm.modf.f64(double %Val)
declare { x86_fp80, x86_fp80 }       @<!-- -->llvm.modf.f80(x86_fp80  %Val)
declare { fp128, fp128 }             @<!-- -->llvm.modf.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }     @<!-- -->llvm.modf.ppcf128(ppc_fp128  %Val)
declare { &lt;4 x float&gt;, &lt;4 x float&gt; } @<!-- -->llvm.modf.v4f32(&lt;4 x float&gt;  %Val)

This corresponds to the libm modf function but returns multiple values in a struct (rather than take output pointers), which makes it easier to vectorize.

Patch is 28.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121948.diff

16 Files Affected:

(modified) llvm/docs/LangRef.rst (+43)
(modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+3)
(modified) llvm/include/llvm/CodeGen/ISDOpcodes.h (+4)
(modified) llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h (+4)
(modified) llvm/include/llvm/IR/Intrinsics.td (+2)
(modified) llvm/include/llvm/IR/RuntimeLibcalls.def (+5)
(modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+7-3)
(modified) llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp (+2-1)
(modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (+9)
(modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (+3)
(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+4)
(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+1)
(modified) llvm/lib/CodeGen/TargetLoweringBase.cpp (+8-3)
(modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+13-13)
(added) llvm/test/CodeGen/AArch64/llvm.modf.ll (+255)
(added) llvm/test/CodeGen/AArch64/veclib-llvm.modf.ll (+57)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 7e01331b20c570..bd067fa5d1c4b4 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -16004,6 +16004,49 @@ of the argument.
 When specified with the fast-math-flag 'afn', the result may be approximated
 using a less accurate calculation.
 
+'``llvm.modf.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.modf`` on any floating-point
+or vector of floating-point type. However, not all targets support all types.
+
+::
+
+ declare { float, float }             @llvm.modf.f32(float  %Val)
+ declare { double, double }           @llvm.modf.f64(double %Val)
+ declare { x86_fp80, x86_fp80 }       @llvm.modf.f80(x86_fp80  %Val)
+ declare { fp128, fp128 }             @llvm.modf.f128(fp128 %Val)
+ declare { ppc_fp128, ppc_fp128 }     @llvm.modf.ppcf128(ppc_fp128  %Val)
+ declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float>  %Val)
+
+Overview:
+"""""""""
+
+The '``llvm.modf.*``' intrinsics return the operand's integral and fractional
+parts.
+
+Arguments:
+""""""""""
+
+The argument is a :ref:`floating-point <t_floating>` value or
+:ref:`vector <t_vector>` of floating-point values. Returns two values matching
+the argument type in a struct.
+
+Semantics:
+""""""""""
+
+Return the same values as a corresponding libm '``modf``' function without
+trapping or setting ``errno``.
+
+The first result is the fractional part of the operand and the second result is
+the integral part of the operand. Both results have the same sign as the operand.
+
+When specified with the fast-math-flag 'afn', the result may be approximated
+using a less accurate calculation.
+
 '``llvm.pow.*``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index c9f142d64ae9e4..46778ca23aeed1 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -2078,6 +2078,9 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
     case Intrinsic::sincos:
       ISD = ISD::FSINCOS;
       break;
+    case Intrinsic::modf:
+      ISD = ISD::FMODF;
+      break;
     case Intrinsic::tan:
       ISD = ISD::FTAN;
       break;
diff --git a/llvm/include/llvm/CodeGen/ISDOpcodes.h b/llvm/include/llvm/CodeGen/ISDOpcodes.h
index 604dc9419025b0..9514cb399c7d81 100644
--- a/llvm/include/llvm/CodeGen/ISDOpcodes.h
+++ b/llvm/include/llvm/CodeGen/ISDOpcodes.h
@@ -1058,6 +1058,10 @@ enum NodeType {
   /// FSINCOS - Compute both fsin and fcos as a single operation.
   FSINCOS,
 
+  /// FMODF - Decomposes the given arg in integral and fractional parts, each
+  /// having the same type and sign as the arg.
+  FMODF,
+
   /// Gets the current floating-point environment. The first operand is a token
   /// chain. The results are FP environment, represented by an integer value,
   /// and a token chain.
diff --git a/llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h b/llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h
index 045ec7d3653119..59313520e0d831 100644
--- a/llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h
+++ b/llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h
@@ -66,6 +66,10 @@ Libcall getFREXP(EVT RetVT);
 /// UNKNOWN_LIBCALL if there is none.
 Libcall getFSINCOS(EVT RetVT);
 
+/// getMODF - Return the MODF_* value for the given types, or
+/// UNKNOWN_LIBCALL if there is none.
+Libcall getMODF(EVT RetVT);
+
 /// Return the SYNC_FETCH_AND_* value for the given opcode and type, or
 /// UNKNOWN_LIBCALL if there is none.
 Libcall getSYNC(unsigned Opc, MVT VT);
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index ee877349a33149..2c22060237faa6 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -1063,6 +1063,8 @@ let IntrProperties = [IntrNoMem, IntrSpeculatable, IntrWillReturn] in {
   def int_roundeven    : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>]>;
   def int_sincos : DefaultAttrsIntrinsic<[LLVMMatchType<0>, LLVMMatchType<0>],
                              [llvm_anyfloat_ty]>;
+  def int_modf : DefaultAttrsIntrinsic<[LLVMMatchType<0>, LLVMMatchType<0>],
+                             [llvm_anyfloat_ty]>;
 
   // Truncate a floating point number with a specific rounding mode
   def int_fptrunc_round : DefaultAttrsIntrinsic<[ llvm_anyfloat_ty ],
diff --git a/llvm/include/llvm/IR/RuntimeLibcalls.def b/llvm/include/llvm/IR/RuntimeLibcalls.def
index 8153845b52c7ae..dc69b1ae19769e 100644
--- a/llvm/include/llvm/IR/RuntimeLibcalls.def
+++ b/llvm/include/llvm/IR/RuntimeLibcalls.def
@@ -354,6 +354,11 @@ HANDLE_LIBCALL(FREXP_F64, "frexp")
 HANDLE_LIBCALL(FREXP_F80, "frexpl")
 HANDLE_LIBCALL(FREXP_F128, "frexpl")
 HANDLE_LIBCALL(FREXP_PPCF128, "frexpl")
+HANDLE_LIBCALL(MODF_F32, "modff")
+HANDLE_LIBCALL(MODF_F64, "modf")
+HANDLE_LIBCALL(MODF_F80, "modfl")
+HANDLE_LIBCALL(MODF_F128, "modfl")
+HANDLE_LIBCALL(MODF_PPCF128, "modfl")
 
 // Floating point environment
 HANDLE_LIBCALL(FEGETENV, "fegetenv")
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
index 595a410101eca1..f080c3aa562e4a 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -4609,12 +4609,15 @@ void SelectionDAGLegalize::ConvertNodeToLibcall(SDNode *Node) {
     ExpandFPLibCall(Node, RTLIB::LDEXP_F32, RTLIB::LDEXP_F64, RTLIB::LDEXP_F80,
                     RTLIB::LDEXP_F128, RTLIB::LDEXP_PPCF128, Results);
     break;
+  case ISD::FMODF:
   case ISD::FFREXP: {
-    RTLIB::Libcall LC = RTLIB::getFREXP(Node->getValueType(0));
+    EVT VT = Node->getValueType(0);
+    RTLIB::Libcall LC = Node->getOpcode() == ISD::FMODF ? RTLIB::getMODF(VT)
+                                                        : RTLIB::getFREXP(VT);
     bool Expanded = DAG.expandMultipleResultFPLibCall(LC, Node, Results,
                                                       /*CallRetResNo=*/0);
     if (!Expanded)
-      llvm_unreachable("Expected scalar FFREXP to expand to libcall!");
+      llvm_unreachable("Expected scalar FFREXP/FMODF to expand to libcall!");
     break;
   }
   case ISD::FPOWI:
@@ -5509,9 +5512,10 @@ void SelectionDAGLegalize::PromoteNode(SDNode *Node) {
     Results.push_back(Tmp2.getValue(1));
     break;
   }
+  case ISD::FMODF:
   case ISD::FSINCOS: {
     Tmp1 = DAG.getNode(ISD::FP_EXTEND, dl, NVT, Node->getOperand(0));
-    Tmp2 = DAG.getNode(ISD::FSINCOS, dl, DAG.getVTList(NVT, NVT), Tmp1,
+    Tmp2 = DAG.getNode(Node->getOpcode(), dl, DAG.getVTList(NVT, NVT), Tmp1,
                        Node->getFlags());
     Tmp3 = DAG.getIntPtrConstant(0, dl, /*isTarget=*/true);
     for (unsigned ResNum = 0; ResNum < Node->getNumValues(); ResNum++)
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
index 71f100bfa03434..2a4eed1ed527a8 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
@@ -2766,10 +2766,10 @@ void DAGTypeLegalizer::PromoteFloatResult(SDNode *N, unsigned ResNo) {
     case ISD::FLDEXP:     R = PromoteFloatRes_ExpOp(N); break;
     case ISD::FFREXP:     R = PromoteFloatRes_FFREXP(N); break;
 
+    case ISD::FMODF:
     case ISD::FSINCOS:
       R = PromoteFloatRes_UnaryWithTwoFPResults(N);
       break;
-
     case ISD::FP_ROUND:   R = PromoteFloatRes_FP_ROUND(N); break;
     case ISD::STRICT_FP_ROUND:
       R = PromoteFloatRes_STRICT_FP_ROUND(N);
@@ -3228,6 +3228,7 @@ void DAGTypeLegalizer::SoftPromoteHalfResult(SDNode *N, unsigned ResNo) {
 
   case ISD::FFREXP:      R = SoftPromoteHalfRes_FFREXP(N); break;
 
+  case ISD::FMODF:
   case ISD::FSINCOS:
     R = SoftPromoteHalfRes_UnaryWithTwoFPResults(N);
     break;
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
index e8404a13009a72..c4e282cf2dad29 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
@@ -454,6 +454,7 @@ SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
   case ISD::UMULO:
   case ISD::FCANONICALIZE:
   case ISD::FFREXP:
+  case ISD::FMODF:
   case ISD::FSINCOS:
   case ISD::SADDSAT:
   case ISD::UADDSAT:
@@ -1205,6 +1206,14 @@ void VectorLegalizer::Expand(SDNode *Node, SmallVectorImpl<SDValue> &Results) {
       return;
     break;
   }
+  case ISD::FMODF: {
+    RTLIB::Libcall LC =
+        RTLIB::getMODF(Node->getValueType(0).getVectorElementType());
+    if (DAG.expandMultipleResultFPLibCall(LC, Node, Results,
+                                          /*CallRetResNo=*/0))
+      return;
+    break;
+  }
   case ISD::VECTOR_COMPRESS:
     Results.push_back(TLI.expandVECTOR_COMPRESS(Node, DAG));
     return;
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 780eba16c9c498..2d8676ffb66e3e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -129,6 +129,7 @@ void DAGTypeLegalizer::ScalarizeVectorResult(SDNode *N, unsigned ResNo) {
   case ISD::ADDRSPACECAST:
     R = ScalarizeVecRes_ADDRSPACECAST(N);
     break;
+  case ISD::FMODF:
   case ISD::FFREXP:
   case ISD::FSINCOS:
     R = ScalarizeVecRes_UnaryOpWithTwoResults(N, ResNo);
@@ -1257,6 +1258,7 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
   case ISD::ADDRSPACECAST:
     SplitVecRes_ADDRSPACECAST(N, Lo, Hi);
     break;
+  case ISD::FMODF:
   case ISD::FFREXP:
   case ISD::FSINCOS:
     SplitVecRes_UnaryOpWithTwoResults(N, ResNo, Lo, Hi);
@@ -4779,6 +4781,7 @@ void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {
   case ISD::VP_FSHR:
     Res = WidenVecRes_Ternary(N);
     break;
+  case ISD::FMODF:
   case ISD::FFREXP:
   case ISD::FSINCOS: {
     if (!unrollExpandedOp())
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index f8d7c3ef7bbe71..036fff7a6cb6e4 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -6994,6 +6994,7 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
                              getValue(I.getArgOperand(0)),
                              getValue(I.getArgOperand(1)), Flags));
     return;
+  case Intrinsic::modf:
   case Intrinsic::sincos:
   case Intrinsic::frexp: {
     unsigned Opcode;
@@ -7003,6 +7004,9 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
     case Intrinsic::sincos:
       Opcode = ISD::FSINCOS;
       break;
+    case Intrinsic::modf:
+      Opcode = ISD::FMODF;
+      break;
     case Intrinsic::frexp:
       Opcode = ISD::FFREXP;
       break;
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index 580ff19065557b..99ce0eb8a5ee4e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -219,6 +219,7 @@ std::string SDNode::getOperationName(const SelectionDAG *G) const {
   case ISD::FCOS:                       return "fcos";
   case ISD::STRICT_FCOS:                return "strict_fcos";
   case ISD::FSINCOS:                    return "fsincos";
+  case ISD::FMODF:                      return "fmodf";
   case ISD::FTAN:                       return "ftan";
   case ISD::STRICT_FTAN:                return "strict_ftan";
   case ISD::FASIN:                      return "fasin";
diff --git a/llvm/lib/CodeGen/TargetLoweringBase.cpp b/llvm/lib/CodeGen/TargetLoweringBase.cpp
index 3b0e9c7526fd0a..8a6b66a0ef7f77 100644
--- a/llvm/lib/CodeGen/TargetLoweringBase.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringBase.cpp
@@ -407,6 +407,11 @@ RTLIB::Libcall RTLIB::getFSINCOS(EVT RetVT) {
                       SINCOS_PPCF128);
 }
 
+RTLIB::Libcall RTLIB::getMODF(EVT RetVT) {
+  return getFPLibCall(RetVT, MODF_F32, MODF_F64, MODF_F80, MODF_F128,
+                      MODF_PPCF128);
+}
+
 RTLIB::Libcall RTLIB::getOutlineAtomicHelper(const Libcall (&LC)[5][4],
                                              AtomicOrdering Order,
                                              uint64_t MemSize) {
@@ -775,9 +780,9 @@ void TargetLoweringBase::initActions() {
     setOperationAction({ISD::BITREVERSE, ISD::PARITY}, VT, Expand);
 
     // These library functions default to expand.
-    setOperationAction(
-        {ISD::FROUND, ISD::FPOWI, ISD::FLDEXP, ISD::FFREXP, ISD::FSINCOS}, VT,
-        Expand);
+    setOperationAction({ISD::FROUND, ISD::FPOWI, ISD::FLDEXP, ISD::FFREXP,
+                        ISD::FSINCOS, ISD::FMODF},
+                       VT, Expand);
 
     // These operations default to expand for vector types.
     if (VT.isVector())
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index e35ad524885015..86627929fc4336 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -732,19 +732,19 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
     setOperationAction(ISD::FCOPYSIGN, MVT::bf16, Promote);
   }
 
-  for (auto Op : {ISD::FREM,          ISD::FPOW,          ISD::FPOWI,
-                  ISD::FCOS,          ISD::FSIN,          ISD::FSINCOS,
-                  ISD::FACOS,         ISD::FASIN,         ISD::FATAN,
-                  ISD::FATAN2,        ISD::FCOSH,         ISD::FSINH,
-                  ISD::FTANH,         ISD::FTAN,          ISD::FEXP,
-                  ISD::FEXP2,         ISD::FEXP10,        ISD::FLOG,
-                  ISD::FLOG2,         ISD::FLOG10,        ISD::STRICT_FREM,
-                  ISD::STRICT_FPOW,   ISD::STRICT_FPOWI,  ISD::STRICT_FCOS,
-                  ISD::STRICT_FSIN,   ISD::STRICT_FACOS,  ISD::STRICT_FASIN,
-                  ISD::STRICT_FATAN,  ISD::STRICT_FATAN2, ISD::STRICT_FCOSH,
-                  ISD::STRICT_FSINH,  ISD::STRICT_FTANH,  ISD::STRICT_FEXP,
-                  ISD::STRICT_FEXP2,  ISD::STRICT_FLOG,   ISD::STRICT_FLOG2,
-                  ISD::STRICT_FLOG10, ISD::STRICT_FTAN}) {
+  for (auto Op : {ISD::FREM,         ISD::FPOW,          ISD::FPOWI,
+                  ISD::FCOS,         ISD::FSIN,          ISD::FSINCOS,
+                  ISD::FMODF,        ISD::FACOS,         ISD::FASIN,
+                  ISD::FATAN,        ISD::FATAN2,        ISD::FCOSH,
+                  ISD::FSINH,        ISD::FTANH,         ISD::FTAN,
+                  ISD::FEXP,         ISD::FEXP2,         ISD::FEXP10,
+                  ISD::FLOG,         ISD::FLOG2,         ISD::FLOG10,
+                  ISD::STRICT_FREM,  ISD::STRICT_FPOW,   ISD::STRICT_FPOWI,
+                  ISD::STRICT_FCOS,  ISD::STRICT_FSIN,   ISD::STRICT_FACOS,
+                  ISD::STRICT_FASIN, ISD::STRICT_FATAN,  ISD::STRICT_FATAN2,
+                  ISD::STRICT_FCOSH, ISD::STRICT_FSINH,  ISD::STRICT_FTANH,
+                  ISD::STRICT_FEXP,  ISD::STRICT_FEXP2,  ISD::STRICT_FLOG,
+                  ISD::STRICT_FLOG2, ISD::STRICT_FLOG10, ISD::STRICT_FTAN}) {
     setOperationAction(Op, MVT::f16, Promote);
     setOperationAction(Op, MVT::v4f16, Expand);
     setOperationAction(Op, MVT::v8f16, Expand);
diff --git a/llvm/test/CodeGen/AArch64/llvm.modf.ll b/llvm/test/CodeGen/AArch64/llvm.modf.ll
new file mode 100644
index 00000000000000..41fe796daca86c
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/llvm.modf.ll
@@ -0,0 +1,255 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
+; RUN: llc -mtriple=aarch64-gnu-linux < %s | FileCheck -check-prefixes=CHECK %s
+
+define { half, half } @test_modf_f16(half %a) {
+; CHECK-LABEL: test_modf_f16:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    fcvt s0, h0
+; CHECK-NEXT:    add x0, sp, #12
+; CHECK-NEXT:    bl modff
+; CHECK-NEXT:    ldr s1, [sp, #12]
+; CHECK-NEXT:    fcvt h0, s0
+; CHECK-NEXT:    fcvt h1, s1
+; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+  %result = call { half, half } @llvm.modf.f16(half %a)
+  ret { half, half } %result
+}
+
+define half @test_modf_f16_only_use_fractional_part(half %a) {
+; CHECK-LABEL: test_modf_f16_only_use_fractional_part:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    fcvt s0, h0
+; CHECK-NEXT:    add x0, sp, #12
+; CHECK-NEXT:    bl modff
+; CHECK-NEXT:    fcvt h0, s0
+; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+  %result = call { half, half } @llvm.modf.f16(half %a)
+  %result.0 = extractvalue { half, half } %result, 0
+  ret half %result.0
+}
+
+define half @test_modf_f16_only_use_integral_part(half %a) {
+; CHECK-LABEL: test_modf_f16_only_use_integral_part:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    fcvt s0, h0
+; CHECK-NEXT:    add x0, sp, #12
+; CHECK-NEXT:    bl modff
+; CHECK-NEXT:    ldr s0, [sp, #12]
+; CHECK-NEXT:    fcvt h0, s0
+; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+  %result = call { half, half } @llvm.modf.f16(half %a)
+  %result.1 = extractvalue { half, half } %result, 1
+  ret half %result.1
+}
+
+define { <2 x half>, <2 x half> } @test_modf_v2f16(<2 x half> %a) {
+; CHECK-LABEL: test_modf_v2f16:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    sub sp, sp, #64
+; CHECK-NEXT:    str x30, [sp, #48] // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 64
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    // kill: def $d0 killed $d0 def $q0
+; CHECK-NEXT:    mov h1, v0.h[1]
+; CHECK-NEXT:    str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT:    add x0, sp, #44
+; CHECK-NEXT:    fcvt s0, h1
+; CHECK-NEXT:    bl modff
+; CHECK-NEXT:    ldr q1, [sp] // 16-byte Folded Reload
+; CHECK-NEXT:    fcvt h0, s0
+; CHECK-NEXT:    add x0, sp, #40
+; CHECK-NEXT:    fcvt s1, h1
+; CHECK-NEXT:    str q0, [sp, #16] // 16-byte Folded Spill
+; CHECK-NEXT:    fmov s0, s1
+; CHECK-NEXT:    bl modff
+; CHECK-NEXT:    ldr q1, [sp] // 16-byte Folded Reload
+; CHECK-NEXT:    fcvt h2, s0
+; CHECK-NEXT:    add x0, sp, #56
+; CHECK-NEXT:    mov h1, v1.h[2]
+; CHECK-NEXT:    fcvt s0, h1
+; CHECK-NEXT:    ldr q1, [sp, #16] // 16-byte Folded Reload
+; CHECK-NEXT:    mov v2.h[1], v1.h[0]
+; CHECK-NEXT:    str q2, [sp, #16] // 16-byte Folded Spill
+; CHECK-NEXT:    bl modff
+; CHECK-NEXT:    ldr q1, [sp] // 16-byte Folded Reload
+; CHECK-NEXT:    fcvt h2, s0
+; CHECK-NEXT:    add x0, sp, #60
+; CHECK-NEXT:    mov h1, v1.h[3]
+; CHECK-NEXT:    fcvt s0, h1
+; CHECK-NEXT:    ldr q1, [sp, #16] // 16-byte Folded Reload
+; CHECK-NEXT:    mov v1.h[2], v2.h[0]
+; CHECK-NEXT:    str q1, [sp, #16] // 16-byte Folded Spill
+; CHECK-NEXT:    bl modff
+; CHECK-NEXT:    ldp s2, s1, [sp, #40]
+; CHECK-NEXT:    fcvt h4, s0
+; CHECK-NEXT:    ldr q0, [sp, #16] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr x30, [sp, #48] // 8-byte Folded Reload
+; CHECK-NEXT:    fcvt h3, s1
+; CHECK-NEXT:    fcvt h1, s2
+; CHECK-NEXT:    ldr s2, [sp, #56]
+; CHECK-NEXT:    mov v0.h[3], v4.h[0]
+; CHECK-NEXT:    fcvt h2, s2
+; CHECK-NEXT:    // kill: def $d0 killed $d0 killed $q0
+; CHECK-NEXT:    mov v1.h[1], v3.h[0]
+; CHECK-NEXT:    ldr s3, [sp, #60]
+; CHECK-NEXT:    mov v1.h[2], v2.h[0]
+; CHECK-NEXT:    fcvt h2, s3
+; CHECK-NEXT:    mov v1.h[3], v2.h[0]
+; CHECK-NEXT:    // kill: def $d1 killed $d1 killed $q1
+; CHECK-NEXT:    add sp, sp, #64
+; CHECK-NEXT:    ret
+  %result = call { <2 x half>, <2 x half> } @llvm.modf.v2f16(<2 x half> %a)
+  ret { <2 x half>, <2 x half> } %result
+}
+
+define { float, float } @test_modf_f32(float %a) {
+; CHECK-LABEL: test_modf_f32:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    add x0, sp, #12
+; CHECK-NEX...
[truncated]

llvm/docs/LangRef.rst

This adds the `llvm.modf` intrinsic, legalization, and lowering. The `llvm.modf` intrinsic takes a floating-point value and returns both the integral and fractional parts (as a struct). ``` declare { float, float } @llvm.modf.f32(float %Val) declare { double, double } @llvm.modf.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.modf.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.modf.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.modf.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float> %Val) ``` This corresponds to the libm `modf` function but returns multiple values in a struct (rather than take output pointers), which makes it easier to vectorize.

llvm/docs/LangRef.rst

paulwalker-arm

Sorry, I've been reviewing this PR on and off throughout the day and had some latent suggestions. Otherwise I think this PR is good to go.

llvm/include/llvm/CodeGen/ISDOpcodes.h

paulwalker-arm · 2025-02-06T14:36:52Z

llvm/test/CodeGen/AArch64/veclib-llvm.modf.ll

+; SLEEF-LABEL: test_modf_v4f32:
+; SLEEF:    bl _ZGVnN4vl4_modff
+;
+; ARMPL-LABEL: test_modf_v4f32:
+; ARMPL:    bl armpl_vmodfq_f32


I'd rather see the full output here because the code goes to some trouble to ensure the IR stores are merged with the library call and so it seems prudent to verify that happens. Perhaps worth having some negative tests (e.g. where %out_integral is loaded from after the call but before the IR store) unless that scenario is already covered elsewhere?

Sure, I've removed the filtering. I'll add a few negative tests later on (I think there's some already for sincos/frexp for the store folding, but possibly not testing vectors too).

Awesome. Thanks.

I've added modf_store_merging_load_before_store() and you can see the behaviour is correct (the load is before the call, and the value is saved around the call).

llvm/include/llvm/CodeGen/ISDOpcodes.h

paulwalker-arm · 2025-02-06T15:00:57Z

llvm/test/CodeGen/AArch64/veclib-llvm.modf.ll

+; SLEEF-LABEL: test_modf_v4f32:
+; SLEEF:    bl _ZGVnN4vl4_modff
+;
+; ARMPL-LABEL: test_modf_v4f32:
+; ARMPL:    bl armpl_vmodfq_f32


Awesome. Thanks.

This adds the `llvm.modf` intrinsic, legalization, and lowering (mostly reusing the lowering for sincos and frexp). The `llvm.modf` intrinsic takes a floating-point value and returns both the integral and fractional parts (as a struct). ``` declare { float, float } @llvm.modf.f32(float %Val) declare { double, double } @llvm.modf.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.modf.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.modf.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.modf.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float> %Val) ``` This corresponds to the libm `modf` function but returns multiple values in a struct (rather than take output pointers), which makes it easier to vectorize.

MacDue requested review from arsenm, RKSimon and sdesmalen-arm January 7, 2025 14:56

llvmbot added backend:AArch64 llvm:SelectionDAG SelectionDAGISel as well llvm:ir labels Jan 7, 2025

arsenm reviewed Jan 7, 2025

View reviewed changes

llvm/docs/LangRef.rst Outdated Show resolved Hide resolved

MacDue added 2 commits February 5, 2025 10:48

Update docs

d5597d6

MacDue force-pushed the llvm_modf branch from ae79135 to d5597d6 Compare February 5, 2025 14:34

MacDue requested review from SamTebbs33 and huntergr-arm February 5, 2025 14:36

mgabka requested a review from paulwalker-arm February 5, 2025 15:09

MacDue mentioned this pull request Feb 5, 2025

[IR] Add llvm.sincospi intrinsic #125873

Merged

SamTebbs33 approved these changes Feb 6, 2025

View reviewed changes

SamTebbs33 reviewed Feb 6, 2025

View reviewed changes

llvm/docs/LangRef.rst Outdated Show resolved Hide resolved

paulwalker-arm reviewed Feb 6, 2025

View reviewed changes

llvm/docs/LangRef.rst Outdated Show resolved Hide resolved

paulwalker-arm reviewed Feb 6, 2025

View reviewed changes

llvm/docs/LangRef.rst Outdated Show resolved Hide resolved

Fixups

6878e5b

paulwalker-arm reviewed Feb 6, 2025

View reviewed changes

Fixups

229a791

paulwalker-arm approved these changes Feb 6, 2025

View reviewed changes

Fixups

9b53845

MacDue merged commit 4bf97aa into llvm:main Feb 7, 2025
9 checks passed

MacDue deleted the llvm_modf branch February 7, 2025 09:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[IR] Add `llvm.modf` intrinsic #121948

[IR] Add `llvm.modf` intrinsic #121948

Uh oh!

MacDue commented Jan 7, 2025

Uh oh!

llvmbot commented Jan 7, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

paulwalker-arm left a comment

Uh oh!

Uh oh!

paulwalker-arm Feb 6, 2025

Uh oh!

MacDue Feb 6, 2025

Uh oh!

paulwalker-arm Feb 6, 2025

Uh oh!

MacDue Feb 6, 2025

Uh oh!

Uh oh!

paulwalker-arm Feb 6, 2025

Uh oh!

Uh oh!

Uh oh!

[IR] Add llvm.modf intrinsic #121948

[IR] Add llvm.modf intrinsic #121948

Uh oh!

Conversation

MacDue commented Jan 7, 2025

Uh oh!

llvmbot commented Jan 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

paulwalker-arm left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

paulwalker-arm Feb 6, 2025

Choose a reason for hiding this comment

Uh oh!

MacDue Feb 6, 2025

Choose a reason for hiding this comment

Uh oh!

paulwalker-arm Feb 6, 2025

Choose a reason for hiding this comment

Uh oh!

MacDue Feb 6, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

paulwalker-arm Feb 6, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

[IR] Add `llvm.modf` intrinsic #121948

[IR] Add `llvm.modf` intrinsic #121948

llvmbot commented Jan 7, 2025 •

edited

Loading