-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[DAGCombiner] Eliminate fp casts if we have the right fast math flags #131345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18455,7 +18455,45 @@ SDValue DAGCombiner::visitFP_ROUND(SDNode *N) { | |
return SDValue(); | ||
} | ||
|
||
// Eliminate a floating-point widening of a narrowed value if the fast math | ||
// flags allow it. | ||
static SDValue eliminateFPCastPair(SDNode *N) { | ||
SDValue N0 = N->getOperand(0); | ||
EVT VT = N->getValueType(0); | ||
|
||
unsigned NarrowingOp; | ||
switch (N->getOpcode()) { | ||
case ISD::FP16_TO_FP: | ||
NarrowingOp = ISD::FP_TO_FP16; | ||
break; | ||
case ISD::BF16_TO_FP: | ||
NarrowingOp = ISD::FP_TO_BF16; | ||
break; | ||
case ISD::FP_EXTEND: | ||
NarrowingOp = ISD::FP_ROUND; | ||
break; | ||
default: | ||
llvm_unreachable("Expected widening FP cast"); | ||
} | ||
Comment on lines
+18464
to
+18477
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should not be a switch/assign/break. Surely there's a utility function for this somewhere already? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There doesn't seem to be one as far as I can tell. I could move this out into a separate function like e.g. ISD::getInverseMinMaxOpcode, but that would just be moving the switch to a separate location. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You could change each call site to pass in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That then means |
||
|
||
if (N0.getOpcode() == NarrowingOp && N0.getOperand(0).getValueType() == VT) { | ||
const SDNodeFlags NarrowFlags = N0->getFlags(); | ||
const SDNodeFlags WidenFlags = N->getFlags(); | ||
// Narrowing can introduce inf and change the encoding of a nan, so the | ||
// widen must have the nnan and ninf flags to indicate that we don't need to | ||
// care about that. We are also removing a rounding step, and that requires | ||
// both the narrow and widen to allow contraction. | ||
if (WidenFlags.hasNoNaNs() && WidenFlags.hasNoInfs() && | ||
NarrowFlags.hasAllowContract() && WidenFlags.hasAllowContract()) { | ||
return N0.getOperand(0); | ||
} | ||
} | ||
|
||
return SDValue(); | ||
} | ||
|
||
SDValue DAGCombiner::visitFP_EXTEND(SDNode *N) { | ||
SelectionDAG::FlagInserter FlagsInserter(DAG, N); | ||
SDValue N0 = N->getOperand(0); | ||
EVT VT = N->getValueType(0); | ||
SDLoc DL(N); | ||
|
@@ -18507,6 +18545,9 @@ SDValue DAGCombiner::visitFP_EXTEND(SDNode *N) { | |
if (SDValue NewVSel = matchVSelectOpSizesWithSetCC(N)) | ||
return NewVSel; | ||
|
||
if (SDValue CastEliminated = eliminateFPCastPair(N)) | ||
return CastEliminated; | ||
|
||
return SDValue(); | ||
} | ||
|
||
|
@@ -27209,6 +27250,7 @@ SDValue DAGCombiner::visitFP_TO_FP16(SDNode *N) { | |
} | ||
|
||
SDValue DAGCombiner::visitFP16_TO_FP(SDNode *N) { | ||
SelectionDAG::FlagInserter FlagsInserter(DAG, N); | ||
auto Op = N->getOpcode(); | ||
assert((Op == ISD::FP16_TO_FP || Op == ISD::BF16_TO_FP) && | ||
"opcode should be FP16_TO_FP or BF16_TO_FP."); | ||
|
@@ -27223,6 +27265,9 @@ SDValue DAGCombiner::visitFP16_TO_FP(SDNode *N) { | |
} | ||
} | ||
|
||
if (SDValue CastEliminated = eliminateFPCastPair(N)) | ||
return CastEliminated; | ||
|
||
// Sometimes constants manage to survive very late in the pipeline, e.g., | ||
// because they are wrapped inside the <1 x f16> type. Try one last time to | ||
// get rid of them. | ||
|
Large diffs are not rendered by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you leave these cases for a separate patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason for doing it that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because these cases are a pain, and it's not obvious to me this is even tested in the patch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FP16_TO_FP is tested in llvm/test/CodeGen/ARM/fp16_fast_math.ll - ARM uses FP16_TO_FP for fp16-to-float conversion as it doesn't have registers for fp16 types, AArch64 uses FP_EXTEND as it does have such registers. We don't have any BF16 tests though, so I've now added these.