-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[InstCombine] Eliminate fptrunc/fpext if fast math flags allow it #115027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think
contract
is suitable here. Previously,contract
means the optimizer is allowed to fold fmul+fadd into a fma. But the transformation here is too aggressive since clang may be the first compiler to fold this pattern: https://godbolt.org/z/bYfz334PrThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LLVM LangRef is a bit unhelpful as to what the contract flag allows, as it just says "Allow floating-point contraction" without defining what it means by contraction. The definition of contraction in the C23 standard (in section 6.5.1) is clearer:
with a footnote saying that in a contracted expression the intermediate operations are as if evaluated to infinite range and precision. If the expression
(double)(float)x
(where x is a double) is contracted then the double-to-float and float-to-double operations are evaluated as if to infinite range and precision, meaning the result is just x.So long as what clang is doing is correct I don't think it matters that it's the first compiler to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not comfortable with using
contract
in this way. I can see your point that this meets the C23 definition for FP_CONTRACT, but I don't think it's what users would expect. My feeling is that users probably associate the contract flag with FMA. The C23 standard extends the definition, presumably in anticipation of similar hardware instructions that are able to fuse other combinations of operations, but the transformation proposed in this PR is most likely circumventing an explicit user instruction intended to truncate a value.In order to get the 'contract' flag in isolation from clang, for example, you'd need to use the
-ffp-contract=fast
option. The documentation for this option says, "Specify when the compiler is permitted to form fused floating-point operations, such as fused multiply-add (FMA)." There's nothing there that indicates it will allow the compiler to disregard changes in precision implied (or explicitly requested) by the source code. If I were using this option, my intention would be to enable FMA formation across expressions. I would not be happy if the compiler changed my results in other ways.This gets us into a problematic situation. I think we'd all agree that the full set of fast-math flags should enable this transformation. However, there isn't a flag for general value-changing optimizations, so if we don't allow this with
contract
we're going to have to either look at the "unsafe-fp-math" function attribute or do something more or less arbitrary in requiring some other combination of flags.@jcranmer-intel What's your opinion on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got an RFC on the
contract
semantics almost ready to go (let's see if I'm successful in getting it published this week). The working definition I have, that generalizes it from just FMA formation, is essentially "new expression would evaluate the same result if both old and newer were evaluated at infinite range/precision, and the new expression only has at most one instruction that causes rounding" (a few other conditions, but that's the main one).The main goal is to generalize to cover cases like
fms
instructions (a * b - c
), but when I was reviewing a lot of the documentation in C for#pragma STDC FP_CONTRACT
, I've also found thatfpext; libm; fptrunc
is another case that seems to be contemplated forFP_CONTRACT
.By this definition of
contract
, then this optimization would be correct. But I'll also admit to having a vague level of unease about this being incontract
--round-tripping via a smaller value tends to feel intentional, so removing it should have some stronger level of intent. Of all of the FMFs,contract
is also the flag that is probably the most likely for users to default to enable, so it helps to be a little less aggressive in the optimizations here.Stepping back a touch: I've increasingly come to the opinion that FMA formation via "combine add/mul into fma if some flag is present" isn't the best way to tackle optimization. It's better to instead have a "fast_fma" primitive, which does fma if there's a hardware instruction for it, and add/mul if there isn't. This is ultimately something that requires source changes, even language changes, so we may be screwed out of a path for this for C/C++, though.
If people want it, I can bring up this topic to the CFP study group.