[Driver] Make ffp-model=fast honor non-finite-values, introduce ffp-model=aggressive (#100453)

Andy Kaylor · web-flow · commit 27e5f505e5ee · 2024-08-20T07:11:29.000-07:00
This change modifies -ffp-model=fast to select options that more closely match -funsafe-math-optimizations, and introduces a new model, -ffp-model=aggressive, which matches the previous behavior of -ffp-model=fast (except for a minor change in the fp-contract behavior). The primary motivation for this change is to make -ffp-model=fast more user friendly, particularly in light of LLVM's aggressive optimizations when -fno-honor-nans and -fno-honor-infinities are used. This was previously proposed here: https://discourse.llvm.org/t/making-ffp-model-fast-more-user-friendly/78402
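
As an illustration of the motivation (not part of this patch), here is a minimal sketch of code that relies on NaN being honored. The helper name `mean_ignoring_nans` is hypothetical. When NaNs are not honored, as with the previous -ffp-model=fast and now with -ffp-model=aggressive, the compiler is permitted to assume NaN never occurs, so a guard like the `std::isnan` check below may be optimized away. With the updated -ffp-model=fast, NaNs and infinities are honored and the guard is expected to survive.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper, not taken from the patch: averages the input values
// while skipping NaN entries. Its correctness depends on the NaN check
// being preserved by the compiler.
double mean_ignoring_nans(const double *values, std::size_t count) {
  double sum = 0.0;
  std::size_t used = 0;
  for (std::size_t i = 0; i < count; ++i) {
    if (std::isnan(values[i]))
      continue; // may be assumed unreachable when NaNs are not honored
    sum += values[i];
    ++used;
  }
  // Return 0.0 when no usable value was seen to avoid dividing by zero.
  return used == 0 ? 0.0 : sum / static_cast<double>(used);
}
```
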
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
@@ -176,6 +176,14 @@ Modified Compiler Flags
 
 - The compiler flag `-fbracket-depth` default value is increased from 256 to 2048.
 
+- The ``-ffp-model`` option has been updated to enable a more limited set of
+  optimizations when the ``fast`` argument is used and to accept a new argument,
+  ``aggressive``. The behavior of ``-ffp-model=aggressive`` is equivalent
+  to the previous behavior of ``-ffp-model=fast``. The updated
+  ``-ffp-model=fast`` behavior no longer assumes finite math only and uses
+  the ``promoted`` algorithm for complex division when possible rather than the
+  less basic (limited range) algorithm.
+
 Removed Compiler Flags
 -------------------------
 
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
@@ -1452,28 +1452,30 @@ describes the various floating point semantic modes and the corresponding option
   "fhonor-infinities", "{on, off}"
   "fsigned-zeros", "{on, off}"
   "freciprocal-math", "{on, off}"
-  "allow_approximate_fns", "{on, off}"
+  "fallow-approximate-fns", "{on, off}"
   "fassociative-math", "{on, off}"
+  "fcomplex-arithmetic", "{basic, improved, full, promoted}"
 
 This table describes the option settings that correspond to the three
 floating point semantic models: precise (the default), strict, and fast.
 
 
 .. csv-table:: Floating Point Models
-  :header: "Mode", "Precise", "Strict", "Fast"
-  :widths: 25, 15, 15, 15
-
-  "except_behavior", "ignore", "strict", "ignore"
-  "fenv_access", "off", "on", "off"
-  "rounding_mode", "tonearest", "dynamic", "tonearest"
-  "contract", "on", "off", "fast"
-  "support_math_errno", "on", "on", "off"
-  "no_honor_nans", "off", "off", "on"
-  "no_honor_infinities", "off", "off", "on"
-  "no_signed_zeros", "off", "off", "on"
-  "allow_reciprocal", "off", "off", "on"
-  "allow_approximate_fns", "off", "off", "on"
-  "allow_reassociation", "off", "off", "on"
+  :header: "Mode", "Precise", "Strict", "Fast", "Aggressive"
+  :widths: 25, 25, 25, 25, 25
+
+  "except_behavior", "ignore", "strict", "ignore", "ignore"
+  "fenv_access", "off", "on", "off", "off"
+  "rounding_mode", "tonearest", "dynamic", "tonearest", "tonearest"
+  "contract", "on", "off", "fast", "fast"
+  "support_math_errno", "on", "on", "off", "off"
+  "no_honor_nans", "off", "off", "off", "on"
+  "no_honor_infinities", "off", "off", "off", "on"
+  "no_signed_zeros", "off", "off", "on", "on"
+  "allow_reciprocal", "off", "off", "on", "on"
+  "allow_approximate_fns", "off", "off", "on", "on"
+  "allow_reassociation", "off", "off", "on", "on"
+  "complex_arithmetic", "full", "full", "promoted", "basic"
 
 The ``-ffp-model`` option does not modify the ``fdenormal-fp-math``
 setting, but it does have an impact on whether ``crtfastmath.o`` is
@@ -1492,9 +1494,9 @@ for more details.
    * Floating-point math obeys regular algebraic rules for real numbers (e.g.
      ``+`` and ``*`` are associative, ``x/y == x * (1/y)``, and
      ``(a + b) * c == a * c + b * c``),
-   * Operands to floating-point operations are not equal to ``NaN`` and
-     ``Inf``, and
-   * ``+0`` and ``-0`` are interchangeable.
+   * No ``NaN`` or infinite values will be operands or results of
+     floating-point operations,
+   * ``+0`` and ``-0`` may be treated as interchangeable.
 
    ``-ffast-math`` also defines the ``__FAST_MATH__`` preprocessor
    macro. Some math libraries recognize this macro and change their behavior.
@@ -1753,7 +1755,7 @@ for more details.
    Specify floating point behavior. ``-ffp-model`` is an umbrella
    option that encompasses functionality provided by other, single
    purpose, floating point options.  Valid values are: ``precise``, ``strict``,
-   and ``fast``.
+   ``fast``, and ``aggressive``.
    Details:
 
    * ``precise`` Disables optimizations that are not value-safe on
@@ -1766,7 +1768,10 @@ for more details.
      ``STDC FENV_ACCESS``: by default ``FENV_ACCESS`` is disabled. This option
      setting behaves as though ``#pragma STDC FENV_ACCESS ON`` appeared at the
      top of the source file.
-   * ``fast`` Behaves identically to specifying both ``-ffast-math`` and
+   * ``fast`` Behaves identically to specifying ``-funsafe-math-optimizations``,
+     ``-fno-math-errno`` and ``-fcomplex-arithmetic=promoted``
+     ``ffp-contract=fast``
+   * ``aggressive`` Behaves identically to specifying both ``-ffast-math`` and
      ``ffp-contract=fast``
 
    Note: If your command line specifies multiple instances
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
@@ -1366,7 +1366,7 @@ bool ToolChain::isFastMathRuntimeAvailable(const ArgList &Args,
       Default = false;
     if (A && A->getOption().getID() == options::OPT_ffp_model_EQ) {
       StringRef Model = A->getValue();
-      if (Model != "fast")
+      if (Model != "fast" && Model != "aggressive")
         Default = false;
     }
   }
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2885,10 +2885,29 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
   std::string ComplexRangeStr = "";
   std::string GccRangeComplexOption = "";
 
+  auto setComplexRange = [&](LangOptions::ComplexRangeKind NewRange) {
+    // Warn if user expects to perform full implementation of complex
+    // multiplication or division in the presence of nnan or ninf flags.
+    if (Range != NewRange)
+      EmitComplexRangeDiag(D,
+                           !GccRangeComplexOption.empty()
+                               ? GccRangeComplexOption
+                               : ComplexArithmeticStr(Range),
+                           ComplexArithmeticStr(NewRange));
+    Range = NewRange;
+  };
+
   // Lambda to set fast-math options. This is also used by -ffp-model=fast
-  auto applyFastMath = [&]() {
-    HonorINFs = false;
-    HonorNaNs = false;
+  auto applyFastMath = [&](bool Aggressive) {
+    if (Aggressive) {
+      HonorINFs = false;
+      HonorNaNs = false;
+      setComplexRange(LangOptions::ComplexRangeKind::CX_Basic);
+    } else {
+      HonorINFs = true;
+      HonorNaNs = true;
+      setComplexRange(LangOptions::ComplexRangeKind::CX_Promoted);
+    }
     MathErrno = false;
     AssociativeMath = true;
     ReciprocalMath = true;
@@ -2897,21 +2916,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
     TrappingMath = false;
     RoundingFPMath = false;
     FPExceptionBehavior = "";
-    // If fast-math is set then set the fp-contract mode to fast.
     FPContract = "fast";
-    // ffast-math enables basic range rules for complex multiplication and
-    // division.
-    // Warn if user expects to perform full implementation of complex
-    // multiplication or division in the presence of nan or ninf flags.
-    if (Range == LangOptions::ComplexRangeKind::CX_Full ||
-        Range == LangOptions::ComplexRangeKind::CX_Improved ||
-        Range == LangOptions::ComplexRangeKind::CX_Promoted)
-      EmitComplexRangeDiag(
-          D, ComplexArithmeticStr(Range),
-          !GccRangeComplexOption.empty()
-              ? GccRangeComplexOption
-              : ComplexArithmeticStr(LangOptions::ComplexRangeKind::CX_Basic));
-    Range = LangOptions::ComplexRangeKind::CX_Basic;
     SeenUnsafeMathModeOption = true;
   };
 
@@ -3039,8 +3044,8 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
       SignedZeros = true;
 
       StringRef Val = A->getValue();
-      if (OFastEnabled && Val != "fast") {
-        // Only -ffp-model=fast is compatible with OFast, ignore.
+      if (OFastEnabled && Val != "aggressive") {
+        // Only -ffp-model=aggressive is compatible with -OFast, ignore.
         D.Diag(clang::diag::warn_drv_overriding_option)
             << Args.MakeArgString("-ffp-model=" + Val) << "-Ofast";
         break;
@@ -3052,13 +3057,19 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
             << Args.MakeArgString("-ffp-model=" + Val);
       if (Val == "fast") {
         FPModel = Val;
-        applyFastMath();
+        applyFastMath(false);
         // applyFastMath sets fp-contract="fast"
         LastFpContractOverrideOption = "-ffp-model=fast";
+      } else if (Val == "aggressive") {
+        FPModel = Val;
+        applyFastMath(true);
+        // applyFastMath sets fp-contract="fast"
+        LastFpContractOverrideOption = "-ffp-model=aggressive";
       } else if (Val == "precise") {
         FPModel = Val;
         FPContract = "on";
         LastFpContractOverrideOption = "-ffp-model=precise";
+        setComplexRange(LangOptions::ComplexRangeKind::CX_Full);
       } else if (Val == "strict") {
         StrictFPModel = true;
         FPExceptionBehavior = "strict";
@@ -3067,6 +3078,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
         LastFpContractOverrideOption = "-ffp-model=strict";
         TrappingMath = true;
         RoundingFPMath = true;
+        setComplexRange(LangOptions::ComplexRangeKind::CX_Full);
       } else
         D.Diag(diag::err_drv_unsupported_option_argument)
             << A->getSpelling() << Val;
@@ -3247,7 +3259,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
         continue;
       [[fallthrough]];
     case options::OPT_ffast_math:
-      applyFastMath();
+      applyFastMath(true);
       if (A->getOption().getID() == options::OPT_Ofast)
         LastFpContractOverrideOption = "-Ofast";
       else
diff --git a/clang/test/CodeGen/ffp-model.c b/clang/test/CodeGen/ffp-model.c
@@ -3,6 +3,9 @@
 // RUN: %clang -S -emit-llvm -fenable-matrix -ffp-model=fast %s -o - \
 // RUN: | FileCheck %s --check-prefixes=CHECK,CHECK-FAST
 
+// RUN: %clang -S -emit-llvm -fenable-matrix -ffp-model=aggressive %s -o - \
+// RUN: | FileCheck %s --check-prefixes=CHECK,CHECK-AGGRESSIVE
+
 // RUN: %clang -S -emit-llvm -fenable-matrix -ffp-model=precise %s -o - \
 // RUN: | FileCheck %s --check-prefixes=CHECK,CHECK-PRECISE
 
@@ -20,9 +23,13 @@ float mymuladd(float x, float y, float z) {
   // CHECK: define{{.*}} float @mymuladd
   return x * y + z;
 
-  // CHECK-FAST: fmul fast float
+  // CHECK-AGGRESSIVE: fmul fast float
+  // CHECK-AGGRESSIVE: load float, ptr
+  // CHECK-AGGRESSIVE: fadd fast float
+
+  // CHECK-FAST: fmul reassoc nsz arcp contract afn float
   // CHECK-FAST: load float, ptr
-  // CHECK-FAST: fadd fast float
+  // CHECK-FAST: fadd reassoc nsz arcp contract afn float
 
   // CHECK-PRECISE: load float, ptr
   // CHECK-PRECISE: load float, ptr
@@ -54,9 +61,13 @@ void my_vec_muladd(v2f x, float y, v2f z, v2f *res) {
   // CHECK: define{{.*}}@my_vec_muladd
   *res = x * y + z;
 
-  // CHECK-FAST: fmul fast <2 x float>
+  // CHECK-AGGRESSIVE: fmul fast <2 x float>
+  // CHECK-AGGRESSIVE: load <2 x float>, ptr
+  // CHECK-AGGRESSIVE: fadd fast <2 x float>
+
+  // CHECK-FAST: fmul reassoc nsz arcp contract afn <2 x float>
   // CHECK-FAST: load <2 x float>, ptr
-  // CHECK-FAST: fadd fast <2 x float>
+  // CHECK-FAST: fadd reassoc nsz arcp contract afn <2 x float>
 
   // CHECK-PRECISE: load <2 x float>, ptr
   // CHECK-PRECISE: load float, ptr
@@ -88,9 +99,13 @@ void my_m21_muladd(m21f x, float y, m21f z, m21f *res) {
   // CHECK: define{{.*}}@my_m21_muladd
   *res = x * y + z;
 
-  // CHECK-FAST: fmul fast <2 x float>
+  // CHECK-AGGRESSIVE: fmul fast <2 x float>
+  // CHECK-AGGRESSIVE: load <2 x float>, ptr
+  // CHECK-AGGRESSIVE: fadd fast <2 x float>
+
+  // CHECK-FAST: fmul reassoc nsz arcp contract afn <2 x float>
   // CHECK-FAST: load <2 x float>, ptr
-  // CHECK-FAST: fadd fast <2 x float>
+  // CHECK-FAST: fadd reassoc nsz arcp contract afn <2 x float>
 
   // CHECK-PRECISE: load <2 x float>, ptr
   // CHECK-PRECISE: load float, ptr
@@ -122,9 +137,13 @@ void my_m22_muladd(m22f x, float y, m22f z, m22f *res) {
   // CHECK: define{{.*}}@my_m22_muladd
   *res = x * y + z;
 
-  // CHECK-FAST: fmul fast <4 x float>
+  // CHECK-AGGRESSIVE: fmul fast <4 x float>
+  // CHECK-AGGRESSIVE: load <4 x float>, ptr
+  // CHECK-AGGRESSIVE: fadd fast <4 x float>
+
+  // CHECK-FAST: fmul reassoc nsz arcp contract afn <4 x float>
   // CHECK-FAST: load <4 x float>, ptr
-  // CHECK-FAST: fadd fast <4 x float>
+  // CHECK-FAST: fadd reassoc nsz arcp contract afn <4 x float>
 
   // CHECK-PRECISE: load <4 x float>, ptr
   // CHECK-PRECISE: load float, ptr
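
The CHECK-FAST and CHECK-AGGRESSIVE expectations above follow from the settings chosen by `applyFastMath(bool Aggressive)`. The following is a simplified standalone model of that relationship, not the driver code itself. The `FPSettings` struct and `fastMathFlags` helper are hypothetical, and the `ApproxFunc` and `SignedZeros` field names are assumptions based on the "Floating Point Models" table rather than on lines shown in this diff.

```cpp
#include <string>

// Hypothetical stand-in for the driver state that applyFastMath modifies.
struct FPSettings {
  bool HonorINFs = true;
  bool HonorNaNs = true;
  bool MathErrno = true;
  bool AssociativeMath = false;
  bool ReciprocalMath = false;
  bool ApproxFunc = false;  // assumed name, from "allow_approximate_fns"
  bool SignedZeros = true;  // assumed name, from "no_signed_zeros"
  std::string FPContract = "on";
  std::string ComplexRange = "full";
};

// Mirrors the documented outcome of -ffp-model=fast (Aggressive == false)
// and -ffp-model=aggressive (Aggressive == true).
FPSettings applyFastMath(bool Aggressive) {
  FPSettings S;
  S.HonorINFs = !Aggressive;
  S.HonorNaNs = !Aggressive;
  S.ComplexRange = Aggressive ? "basic" : "promoted";
  S.MathErrno = false;
  S.AssociativeMath = true;
  S.ReciprocalMath = true;
  S.ApproxFunc = true;
  S.SignedZeros = false;
  S.FPContract = "fast";
  return S;
}

// Renders the settings as LLVM IR fast-math flags. When every flag is set,
// LLVM prints the shorthand "fast" instead of the individual flags.
std::string fastMathFlags(const FPSettings &S) {
  const bool Contract = S.FPContract == "fast";
  if (S.AssociativeMath && !S.HonorNaNs && !S.HonorINFs && !S.SignedZeros &&
      S.ReciprocalMath && Contract && S.ApproxFunc)
    return "fast";
  std::string Flags;
  auto add = [&](bool On, const char *Name) {
    if (!On)
      return;
    if (!Flags.empty())
      Flags += ' ';
    Flags += Name;
  };
  add(S.AssociativeMath, "reassoc");
  add(!S.HonorNaNs, "nnan");
  add(!S.HonorINFs, "ninf");
  add(!S.SignedZeros, "nsz");
  add(S.ReciprocalMath, "arcp");
  add(Contract, "contract");
  add(S.ApproxFunc, "afn");
  return Flags;
}
```

In this model, `fastMathFlags(applyFastMath(false))` yields the `reassoc nsz arcp contract afn` string that the CHECK-FAST lines expect, while `fastMathFlags(applyFastMath(true))` collapses to `fast`, matching the CHECK-AGGRESSIVE lines.
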
diff --git a/clang/test/Driver/fp-model.c b/clang/test/Driver/fp-model.c
