-
Notifications
You must be signed in to change notification settings - Fork 14.3k
SystemZ: Handle gr128 to fp128 copies in copyPhysReg #90861
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-systemz Author: Matt Arsenault (arsenm) ChangesFull diff: https://github.com/llvm/llvm-project/pull/90861.diff 2 Files Affected:
diff --git a/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp b/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
index b3517fb0ea77f5..f4e521c824d6e0 100644
--- a/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
@@ -856,6 +856,22 @@ void SystemZInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
return;
}
+ if (SystemZ::FP128BitRegClass.contains(DestReg) &&
+ SystemZ::GR128BitRegClass.contains(SrcReg)) {
+ MCRegister DestRegHi = RI.getSubReg(DestReg, SystemZ::subreg_h64);
+ MCRegister DestRegLo = RI.getSubReg(DestReg, SystemZ::subreg_l64);
+ MCRegister SrcRegHi = RI.getSubReg(SrcReg, SystemZ::subreg_h64);
+ MCRegister SrcRegLo = RI.getSubReg(SrcReg, SystemZ::subreg_l64);
+
+ BuildMI(MBB, MBBI, DL, get(SystemZ::LDGR), DestRegHi)
+ .addReg(SrcRegHi)
+ .addReg(DestReg, RegState::ImplicitDefine);
+
+ BuildMI(MBB, MBBI, DL, get(SystemZ::LDGR), DestRegLo)
+ .addReg(SrcRegLo, getKillRegState(KillSrc));
+ return;
+ }
+
// Move CC value from a GR32.
if (DestReg == SystemZ::CC) {
unsigned Opcode =
diff --git a/llvm/test/CodeGen/SystemZ/copy-phys-reg-gr128-to-fp128.mir b/llvm/test/CodeGen/SystemZ/copy-phys-reg-gr128-to-fp128.mir
new file mode 100644
index 00000000000000..d1de915e81cef7
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/copy-phys-reg-gr128-to-fp128.mir
@@ -0,0 +1,82 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=s390x-ibm-linux -mcpu=z13 -run-pass=postrapseudos -verify-machineinstrs -o - %s | FileCheck %s
+
+---
+name: copy_gr128_to_fp128__r0q_to_f0q
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $r0q
+ ; CHECK-LABEL: name: copy_gr128_to_fp128__r0q_to_f0q
+ ; CHECK: liveins: $r0q
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $f0d = LDGR $r0d, implicit-def $f0q
+ ; CHECK-NEXT: $f2d = LDGR $r1d
+ ; CHECK-NEXT: Return implicit $f0q
+ $f0q = COPY $r0q
+ Return implicit $f0q
+...
+
+---
+name: copy_gr128_to_fp128__r0q_to_f0q_killed
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $r0q
+ ; CHECK-LABEL: name: copy_gr128_to_fp128__r0q_to_f0q_killed
+ ; CHECK: liveins: $r0q
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $f0d = LDGR $r0d, implicit-def $f0q
+ ; CHECK-NEXT: $f2d = LDGR killed $r1d
+ ; CHECK-NEXT: Return implicit $f0q
+ $f0q = COPY killed $r0q
+ Return implicit $f0q
+...
+
+---
+name: copy_gr128_to_fp128__r0q_to_f0q_undef
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $r0q
+ ; CHECK-LABEL: name: copy_gr128_to_fp128__r0q_to_f0q_undef
+ ; CHECK: liveins: $r0q
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $f0q = KILL undef $r0q
+ ; CHECK-NEXT: Return implicit $f0q
+ $f0q = COPY undef $r0q
+ Return implicit $f0q
+...
+
+---
+name: copy_gr128_to_fp128__r0q_to_f0q_subreg0
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $r0q
+ ; CHECK-LABEL: name: copy_gr128_to_fp128__r0q_to_f0q_subreg0
+ ; CHECK: liveins: $r0q
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $f0d = LDGR $r0d, implicit-def $f0q
+ ; CHECK-NEXT: $f2d = LDGR $r1d
+ ; CHECK-NEXT: Return implicit $f0q
+ $f0q = COPY $r0q
+ Return implicit $f0q
+...
+
+---
+name: copy_gr128_to_fp128__r0q_to_f0q_subreg1
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $r0q
+ ; CHECK-LABEL: name: copy_gr128_to_fp128__r0q_to_f0q_subreg1
+ ; CHECK: liveins: $r0q
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $f0d = LDGR $r0d, implicit-def $f0q
+ ; CHECK-NEXT: $f2d = LDGR $r1d
+ ; CHECK-NEXT: Return implicit $f0q
+ $f0q = COPY $r0q
+ Return implicit $f0q
+...
+
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
00eff16
to
3076920
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but see comment about (apparently) duplicated test cases.
shouldCastAtomicLoadInIR is a hack that should be removed. Simple bitcasting of operations should be in the domain of ordinary type legalization and does not need to be done in the IR. This introduces a code quality regression due to the hack currently used to avoid using 128-bit values in the case where the floating point value is ultimately used as an integer. This would be avoidable if there were always a legal 128-bit type (like v2i64). This is a pretty niche situation so I assume it's not important. I implemented about 85% of the work necessary to make v2i64 legal, but it was taking too long and I lack the necessary familiarity with systemz to complete it. I've pushed it here for someone to pick up: https://github.com/arsenm/llvm-project/pull/new/systemz-legal-v2i64 Depends #90861
Hi @arsenm, coming back to this I don't see how this is actually needed (any more). There are no types that can live in both FP128 and GR128, so nothing should ever generate such COPYs. I guess you ran into this with an early version of the f128 atomic load/store bitcast patch, but the current version splits into two 64-bit copies very early and never trigger this code path either. Unless I'm missing something here, I'd prefer to revert this PR again rather than keep this as dead code. |
No, I ran into this with the better version (https://github.com/arsenm/llvm-project/tree/systemz-legal-v2i64). v2i64 should be legal, and avoids all of this custom lowering hacking around the SelectionDAG requirement to avoid illegal types. What's pushed now is where I landed after I gave up on making sure none of the ABIs changed
copyPhysReg really should be comprehensive. This is tested code so I don't see a compelling reason to remove it and I strongly suggest leaving it. This is avoiding future work for someone It's not guaranteed that the DAG never emits these copies. When I was working on the legal type patch, these were slipping in from not-100% precise selection pattern predicates. Having mandatory optimization is bad, and this should always be available as a fallback. The DAG is also not the only possible source of copies, any machine pass should be able to insert them. GlobalIsel also does away with the "type = register class" DAG reasoning. The fact this was missing also shows there's a missed opportunity for RA to avoid spills by copying to these other classes. This would work if there was a new allocatable synthetic class that was the union of FPR128/GPR128 |
OK, fair enough. Thanks for the explanation. Let's keep it in then. |
No description provided.