SystemZ: Handle copies between gr64 and fp64 #124890

arsenm · 2025-01-29T06:49:37Z

I'm guessing based on tablegen definitions. I also don't
really understand how this could have been missing.

This defends against regressions in a future peephole-opt
patch.

llvmbot · 2025-01-29T06:51:39Z

@llvm/pr-subscribers-backend-systemz

Author: Matt Arsenault (arsenm)

Changes

I'm guessing based on tablegen definitions. I also don't
really understand how this could have been missing.

This defends against regressions in a future peephole-opt
patch.

Full diff: https://github.com/llvm/llvm-project/pull/124890.diff

3 Files Affected:

(modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp (+6)
(added) llvm/test/CodeGen/SystemZ/copy-phys-reg-fp64-to-gr64.mir (+48)
(added) llvm/test/CodeGen/SystemZ/copy-phys-reg-gr64-to-fp64.mir (+47)

diff --git a/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp b/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
index a6fb5ab0ee9e1b..8a32d998fce2cd 100644
--- a/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
@@ -983,6 +983,12 @@ void SystemZInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
     Opcode = SystemZ::VLR;
   else if (SystemZ::AR32BitRegClass.contains(DestReg, SrcReg))
     Opcode = SystemZ::CPYA;
+  else if (SystemZ::GR64BitRegClass.contains(DestReg) &&
+           SystemZ::FP64BitRegClass.contains(SrcReg))
+    Opcode = SystemZ::LGDR;
+  else if (SystemZ::FP64BitRegClass.contains(DestReg) &&
+           SystemZ::GR64BitRegClass.contains(SrcReg))
+    Opcode = SystemZ::LDGR;
   else
     llvm_unreachable("Impossible reg-to-reg copy");
 
diff --git a/llvm/test/CodeGen/SystemZ/copy-phys-reg-fp64-to-gr64.mir b/llvm/test/CodeGen/SystemZ/copy-phys-reg-fp64-to-gr64.mir
new file mode 100644
index 00000000000000..6fd09d25216e74
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/copy-phys-reg-fp64-to-gr64.mir
@@ -0,0 +1,48 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=s390x-ibm-linux -mcpu=z13 -run-pass=postrapseudos -o - %s | FileCheck %s
+
+---
+name:            copy_fp64_to_gr64__r1d_to_f3d
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $r1d
+    ; CHECK-LABEL: name: copy_fp64_to_gr64__r1d_to_f3d
+    ; CHECK: liveins: $r1d
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: $f3d = LDGR $r1d
+    ; CHECK-NEXT: Return implicit $f3d
+    $f3d = COPY $r1d
+    Return implicit $f3d
+...
+
+---
+name:            copy_fp64_to_gr64__r1d_to_f3d_undef
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $r1d
+    ; CHECK-LABEL: name: copy_fp64_to_gr64__r1d_to_f3d_undef
+    ; CHECK: liveins: $r1d
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: $f3d = KILL undef $r1d
+    ; CHECK-NEXT: Return implicit $f3d
+    $f3d = COPY undef $r1d
+    Return implicit $f3d
+...
+
+---
+name:            copy_fp64_to_gr64__r1d_to_f3d_killed
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $r1d
+    ; CHECK-LABEL: name: copy_fp64_to_gr64__r1d_to_f3d_killed
+    ; CHECK: liveins: $r1d
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: $f3d = LDGR killed $r1d
+    ; CHECK-NEXT: Return implicit $f3d
+    $f3d = COPY killed $r1d
+    Return implicit $f3d
+...
+
diff --git a/llvm/test/CodeGen/SystemZ/copy-phys-reg-gr64-to-fp64.mir b/llvm/test/CodeGen/SystemZ/copy-phys-reg-gr64-to-fp64.mir
new file mode 100644
index 00000000000000..07ef93415bb79e
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/copy-phys-reg-gr64-to-fp64.mir
@@ -0,0 +1,47 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=s390x-ibm-linux -mcpu=z13 -run-pass=postrapseudos -o - %s | FileCheck %s
+---
+name:            copy_fp64_to_gr64__f3d_to_r1d
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $f3d
+    ; CHECK-LABEL: name: copy_fp64_to_gr64__f3d_to_r1d
+    ; CHECK: liveins: $f3d
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: $r1d = LGDR $f3d
+    ; CHECK-NEXT: Return implicit $r1d
+    $r1d = COPY $f3d
+    Return implicit $r1d
+...
+
+---
+name:            copy_fp64_to_gr64__f3d_to_r1d_undef
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $f3d
+    ; CHECK-LABEL: name: copy_fp64_to_gr64__f3d_to_r1d_undef
+    ; CHECK: liveins: $f3d
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: $r1d = KILL undef $f3d
+    ; CHECK-NEXT: Return implicit $r1d
+    $r1d = COPY undef $f3d
+    Return implicit $r1d
+...
+
+---
+name:            copy_fp64_to_gr64__f3d_to_r1d_killed
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $f3d
+    ; CHECK-LABEL: name: copy_fp64_to_gr64__f3d_to_r1d_killed
+    ; CHECK: liveins: $f3d
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: $r1d = LGDR killed $f3d
+    ; CHECK-NEXT: Return implicit $r1d
+    $r1d = COPY killed $f3d
+    Return implicit $r1d
+...
+
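
The two new branches in copyPhysReg can be read as a small lookup on the (destination class, source class) pair. The following is a minimal standalone sketch of that choice, not the LLVM code itself: `RegClass` and `crossClassCopyOpcode` are hypothetical names introduced for illustration, and only the two cross-class cases from the diff are modelled.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical stand-ins for the two register classes touched by the patch.
enum class RegClass { GR64, FP64 };

// Toy model of the opcode choice added by the diff: returns the mnemonic for
// a cross-class copy from Src into Dest, or std::nullopt when the pair is not
// a cross-class copy (those cases are handled by other branches upstream).
std::optional<std::string_view> crossClassCopyOpcode(RegClass Dest,
                                                     RegClass Src) {
  if (Dest == RegClass::GR64 && Src == RegClass::FP64)
    return "LGDR"; // FP64 source, GR64 destination
  if (Dest == RegClass::FP64 && Src == RegClass::GR64)
    return "LDGR"; // GR64 source, FP64 destination
  return std::nullopt;
}
```

Note that the destination class comes first in both the diff and this sketch, so `$r1d = COPY $f3d` maps to LGDR and `$f3d = COPY $r1d` maps to LDGR, matching the CHECK lines in the tests.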

uweigand · 2025-01-29T07:40:56Z

> I'm guessing based on tablegen definitions. I also don't really understand how this could have been missing.

Well, there is not any data type that is legal in both gr64 and fp64, so these copies would never come up. I'm wondering why this is needed now?

arsenm · 2025-01-29T07:52:56Z

> Well, there is not any data type that is legal in both gr64 and fp64, so these copies would never come up. I'm wondering why this is needed now?

There would still be bitcasts, which I would expect to emit a copy. The peephole optimizer doesn't do a great job of looking through various subregister patterns, which I'm working on fixing. Uncoalescable copies and bitcasts get rewritten.

uweigand · 2025-01-29T11:17:41Z

> > Well, there is not any data type that is legal in both gr64 and fp64, so these copies would never come up. I'm wondering why this is needed now?
>
> There would still be bitcasts, which I would expect to emit a copy. Peephole optimizer doesn't do a great job looking through various subregister patterns, which I'm working on fixing. Uncoalescable copies and bitcasts get rewritten

Right now, the bitcasts do not emit a copy; they are selected directly to the LDGR/LGDR instruction patterns:

// Moves between 64-bit integer and floating-point registers.
def LGDR : UnaryRRE<"lgdr", 0xB3CD, bitconvert, GR64, FP64>;
def LDGR : UnaryRRE<"ldgr", 0xB3C1, bitconvert, FP64, GR64>;

If we add those instructions to copyPhysReg, should those bitcasts then just emit copies instead?
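
For readers less familiar with these instructions: both patterns above are tied to `bitconvert`, so each one moves the 64-bit pattern unchanged between the two register files. A minimal sketch of that behaviour in portable C++ follows; `fprToGpr` and `gprToFpr` are hypothetical helper names used only for illustration, and the sketch assumes `double` is a 64-bit IEEE 754 value.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Models the effect of LGDR: the 64 bits held in a floating-point register
// are moved unchanged into a general-purpose register.
std::uint64_t fprToGpr(double Value) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof Bits); // bit-preserving, no conversion
  return Bits;
}

// Models the effect of LDGR: the 64 bits held in a general-purpose register
// are moved unchanged into a floating-point register.
double gprToFpr(std::uint64_t Bits) {
  double Value;
  std::memcpy(&Value, &Bits, sizeof Value); // bit-preserving, no conversion
  return Value;
}
```

Because no numeric conversion takes place, a round trip through both helpers returns the original value, which is why a plain COPY between the two classes could in principle be lowered to the same instructions.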

arsenm · 2025-01-29T11:26:43Z

> // Moves between 64-bit integer and floating-point registers.
> def LGDR : UnaryRRE<"lgdr", 0xB3CD, bitconvert, GR64, FP64>;
> def LDGR : UnaryRRE<"ldgr", 0xB3C1, bitconvert, FP64, GR64>;
>
> If we add those instructions to copyPhysReg, should those bitcasts then just emit copies instead?

Probably. There's a bit of code scattered around to handle "bitcast" instructions, and I'm not really sure why we have it.

uweigand

Thanks. This is in any case OK.

arsenm · 2025-01-30T04:04:23Z

Merge activity

Jan 29, 11:04 PM EST: A user started a stack merge that includes this pull request via Graphite.
Jan 29, 11:08 PM EST: A user merged this pull request with Graphite.

llvm-ci · 2025-01-30T04:35:36Z

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/12708

Here is the relevant piece of the build log for the reference

Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: offloading/host_as_target.c' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 8
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/host_as_target.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/host_as_target.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/host_as_target.c.tmp | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/host_as_target.c
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/host_as_target.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/host_as_target.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/host_as_target.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/host_as_target.c
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/host_as_target.c
# `-----------------------------
# error: command failed with exit status: 2

--

********************

SystemZ: Handle copies between gr64 and fp64

f1c75e4

arsenm added the backend:SystemZ label Jan 29, 2025 — with Graphite App

arsenm requested review from uweigand, JonPsson, rsandifo-arm and JonPsson1 and removed request for JonPsson January 29, 2025 06:50

arsenm marked this pull request as ready for review January 29, 2025 06:51

uweigand approved these changes Jan 30, 2025

arsenm merged commit 1cbfac0 into main Jan 30, 2025
10 of 12 checks passed

arsenm deleted the users/arsenm/systemz/handle-fp64-gr64-physreg-copies branch January 30, 2025 04:08
