-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SystemZ] Don't use FP Load and Test as comparisons to same reg #78074
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
You can test this locally with the following command:git-clang-format --diff e2ce91f48cd606955ce125b009ccc6b5464cb05f 959e8888f17e89a40811e8e9f5ec18035fffc69d -- llvm/lib/Target/SystemZ/SystemZElimCompare.cpp llvm/lib/Target/SystemZ/SystemZISelLowering.cpp llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp View the diff from clang-format here.diff --git a/llvm/lib/Target/SystemZ/SystemZElimCompare.cpp b/llvm/lib/Target/SystemZ/SystemZElimCompare.cpp
index 7423ed429f..5acd6b56ba 100644
--- a/llvm/lib/Target/SystemZ/SystemZElimCompare.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZElimCompare.cpp
@@ -495,7 +495,7 @@ static bool isCompareZero(MachineInstr &Compare) {
if (isLoadAndTestAsCmp(Compare))
return true;
return Compare.getNumExplicitOperands() == 2 &&
- Compare.getOperand(1).isImm() && Compare.getOperand(1).getImm() == 0;
+ Compare.getOperand(1).isImm() && Compare.getOperand(1).getImm() == 0;
}
// Try to optimize cases where comparison instruction Compare is testing
diff --git a/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp b/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
index bf6547cc87..5600fab54f 100644
--- a/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
@@ -1691,7 +1691,8 @@ unsigned SystemZInstrInfo::getLoadAndTest(unsigned Opcode) const {
case SystemZ::LGF: return SystemZ::LTGF;
case SystemZ::LR: return SystemZ::LTR;
case SystemZ::LGFR: return SystemZ::LTGFR;
- case SystemZ::LGR: return SystemZ::LTGR;
+ case SystemZ::LGR:
+ return SystemZ::LTGR;
case SystemZ::LCDFR: return SystemZ::LCDBR;
case SystemZ::LPDFR: return SystemZ::LPDBR;
case SystemZ::LNDFR: return SystemZ::LNDBR;
|
That flag means that the user guarantees the exception flags will not be looked at (which is the default for LLVM floating-point operations unless some variation of strictfp is used). The pseudo will have a correct setting of that flag based on the selection settings (strict FP or not), and this simply needs to be transferred to the actual MI instruction.
No, you cannot always clear the flag, it needs to be copied over from the pseudo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See inline comment, otherwise LGTM.
…#78074) The usage of FP Load and Test instructions as a comparison against zero with the assumption that the dest reg will always reflect the source reg is actually incorrect: Unfortunately, a SNaN will be converted to a QNaN, so the instruction may actually change the value as opposed to being a pure register move with a test. This patch - changes instruction selection to always emit FP LT with a scratch def reg, which will typically be allocated to the same reg if dead. - Removes the conversions into FP LT in SystemZElimcompare.
The usage of FP Load and Test instructions as a comparison against zero with
the assumption that the dest reg will always reflect the source reg is actually
incorrect. Unfortunately, a SNaN will be converted to a QNaN, so the instruction
may actually change the value as opposed to being a pure register move with a
test.
This patch
will typically be allocated to the same reg if dead.
The overall impact of this on benchmarks is very limited, just some ~50 more
COPYs in total.
I tried checking for a single use on zEC12 and in those cases emit the compare form
that has two use operands instead of def/use, but when I removed that I saw that it
had basically no impact (just some ~300 LT using a scratch reg, but no more
spilling/copying). Therefore, this was removed entirely as a simplification.
I tried removing the "forward scan", but that is still valuable with some ~7k L/LG
transformations.
The one open end I am not sure about is the NoFPExcept flag? In the POP it says
that an "exception is recognized". Since the LT may actually raise an exception, it
should not have the NoFPExcept flag. On the other hand, that flag itself should
not be there unless the input value is guaranteed to not be a NaN..?
(In emitLoadAndTestCmp0, I first cleared that flag, but then realized that adjustCCMasksForInstr()
in SystemZElimCompare will then bail from the transformation because of this.)