Commit 858bea8

[LangRef] Adjust the documentation of some fast-math flags. (#99557)
The first change clarifies rewrite-based semantics: when a rewrite is performed, every instruction involved must carry the corresponding flag. This is not a change in semantics; there is wide agreement that most flags already behave this way. The clarification is still necessary, as is the point that a flag like `nnan` differs fundamentally from a flag like `contract`. Note that several InstCombine transforms do not currently check this behavior correctly.

The second change clarifies the specific rewrites performed by `arcp`. These rewrites capture what is necessary to enable the transformations that currently require just `arcp`, none of which use the flag incorrectly right now.
1 parent 90617e9 commit 858bea8
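
The flag rule described in the commit message can be modeled as a set intersection. The following is a minimal Python sketch, not LLVM code: `REWRITE_BASED`, `can_rewrite`, and `rewritten_flags` are hypothetical names introduced for illustration, and the flag set lists only the rewrite-based flags visible in this diff. The sample input mirrors the two `fmul` instructions of the `@orig` example in the diff.

```python
# Hypothetical model of the rewrite-based flag rule; not an LLVM API.
# Assumption: only the rewrite-based flags visible in this diff are listed.
REWRITE_BASED = frozenset({"arcp", "contract", "reassoc"})


def can_rewrite(instr_flags, required):
    """A rewrite needing `required` is legal only if every instruction has it."""
    return all(required in flags for flags in instr_flags)


def rewritten_flags(instr_flags):
    """Rewrite-based flags common to all input instructions (non-empty input)."""
    common = frozenset.intersection(*(frozenset(f) for f in instr_flags))
    return common & REWRITE_BASED


# Flags on the two fmul instructions in @orig.
orig = [{"contract", "reassoc"}, {"contract", "reassoc", "arcp"}]

print(can_rewrite(orig, "reassoc"))   # True: both instructions carry reassoc
print(sorted(rewritten_flags(orig)))  # ['contract', 'reassoc']; arcp is dropped
```

Flags such as `nnan` are deliberately left out of this model, since the diff states that they do not follow the intersection rule.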

1 file changed: +48 -5 lines changed

llvm/docs/LangRef.rst

Lines changed: 48 additions & 5 deletions
@@ -3669,6 +3669,10 @@ LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
 may use the following flags to enable otherwise unsafe
 floating-point transformations.
 
+``fast``
+   This flag is a shorthand for specifying all fast-math flags at once, and
+   imparts no additional semantics from using all of them.
+
 ``nnan``
    No NaNs - Allow optimizations to assume the arguments and result are not
    NaN. If an argument is a nan, or the result would be a nan, it produces
@@ -3684,9 +3688,51 @@ floating-point transformations.
    argument or zero result as insignificant. This does not imply that -0.0
    is poison and/or guaranteed to not exist in the operation.
 
+Rewrite-based flags
+^^^^^^^^^^^^^^^^^^^
+
+The following flags have rewrite-based semantics. These flags allow expressions,
+potentially containing multiple non-consecutive instructions, to be rewritten
+into alternative instructions. When multiple instructions are involved in an
+expression, it is necessary that all of the instructions have the necessary
+rewrite-based flag present on them, and the rewritten instructions will
+generally have the intersection of the flags present on the input instruction.
+
+In the following example, the floating-point expression in the body of ``@orig``
+has ``contract`` and ``reassoc`` in common, and thus if it is rewritten into the
+expression in the body of ``@target``, all of the new instructions get those two
+flags and only those flags as a result. Since the ``arcp`` is present on only
+one of the instructions in the expression, it is not present in the transformed
+expression. Furthermore, this reassociation here is only legal because both the
+instructions had the ``reassoc`` flag; if only one had it, it would not be legal
+to make the transformation.
+
+.. code-block:: llvm
+
+    define double @orig(double %a, double %b, double %c) {
+      %t1 = fmul contract reassoc double %a, %b
+      %val = fmul contract reassoc arcp double %t1, %c
+      ret double %val
+    }
+
+    define double @target(double %a, double %b, double %c) {
+      %t1 = fmul contract reassoc double %b, %c
+      %val = fmul contract reassoc double %a, %t1
+      ret double %val
+    }
+
+These rules do not apply to the other fast-math flags. Whether or not a flag
+like ``nnan`` is present on any or all of the rewritten instructions is based
+on whether or not it is possible for said instruction to have a NaN input or
+output, given the original flags.
+
 ``arcp``
-   Allow Reciprocal - Allow optimizations to use the reciprocal of an
-   argument rather than perform division.
+   Allows division to be treated as a multiplication by a reciprocal.
+   Specifically, this permits ``a / b`` to be considered equivalent to
+   ``a * (1.0 / b)`` (which may subsequently be susceptible to code motion),
+   and it also permits ``a / (b / c)`` to be considered equivalent to
+   ``a * (c / b)``. Both of these rewrites can be applied in either direction:
+   ``a * (c / b)`` can be rewritten into ``a / (b / c)``.
 
 ``contract``
    Allow floating-point contraction (e.g. fusing a multiply followed by an
@@ -3705,9 +3751,6 @@ floating-point transformations.
    Allow reassociation transformations for floating-point instructions.
    This may dramatically change results in floating-point.
 
-``fast``
-   This flag implies all of the others.
-
 .. _uselistorder:
 
 Use-list Order Directives
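
The `arcp` rewrites in the diff are not value-preserving under IEEE-754 arithmetic, which is why they need an explicit flag. The following is a small Python sketch of that point; it assumes Python floats (IEEE-754 binary64) as a stand-in for LLVM `double` and does not model LLVM itself. The variable names and input values are chosen purely for illustration.

```python
# Illustration only: Python floats (IEEE-754 binary64) stand in for LLVM `double`.
a, b, c = 3.0, 10.0, 7.0

# First rewrite from the diff: a / b  <->  a * (1.0 / b)
direct = a / b              # one correctly rounded division
via_recip = a * (1.0 / b)   # reciprocal is rounded, then the product is rounded

print(direct)               # 0.3
print(via_recip)            # 0.30000000000000004
print(direct == via_recip)  # False: the two forms differ in the last bit

# Second rewrite from the diff: a / (b / c)  <->  a * (c / b)
nested = a / (b / c)
swapped = a * (c / b)
print(nested, swapped)      # close, but not guaranteed to be bit-identical
```

The differences are tiny, but they mean an optimizer may only substitute one form for the other when the instructions involved carry `arcp`.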
