AArch64: Allow ZEXT+COPY -> FMOV peephole for ZPR registers as well #135436

Merged
MatzeB merged 1 commit into llvm:main on Apr 14, 2025


MatzeB
Contributor

@MatzeB MatzeB commented Apr 11, 2025

No description provided.

@MatzeB
Contributor Author

MatzeB commented Apr 11, 2025

This avoids an unnecessary `mov w8, w8` (a 32-bit zero-extension) when a 32-bit element is read out of an SVE vector register, in cases like:

char* f(char* p, __SVUint32_t s) {
  unsigned a = s[0];
  return p + a;
}
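Why the extra `mov` is redundant can be sketched in portable C++. This is only a model, not LLVM code: `zext32` and `fmov_w_from_s` are hypothetical helper names, and a 64-bit integer stands in for the X register. It relies on the AArch64 rule that any write to a W register clears bits 63:32 of the corresponding X register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: a std::uint64_t stands in for a 64-bit X register
// whose low half is the W view.

// Models `mov w8, w8` (ORRWrs $wzr, w8, 0): keeps the low 32 bits and
// clears bits 63:32.
constexpr std::uint64_t zext32(std::uint64_t x) {
  return x & 0xFFFFFFFFull;
}

// Models `fmov w8, s0`: the 32-bit lane lands in W8, and because this is a
// W-register write the upper 32 bits are already zero.
constexpr std::uint64_t fmov_w_from_s(std::uint32_t lane0) {
  return static_cast<std::uint64_t>(lane0);
}

// Models `p + a` from the example, with addresses as plain integers.
constexpr std::uint64_t add_offset(std::uint64_t p, std::uint64_t a) {
  return p + a;
}

// Zero-extending the result of the fmov changes nothing, so the address
// computation is the same with or without the extra mov.
static_assert(zext32(fmov_w_from_s(0xDEADBEEFu)) == fmov_w_from_s(0xDEADBEEFu),
              "zext after fmov is a no-op");
static_assert(add_offset(0x1000, zext32(fmov_w_from_s(0x10u))) ==
                  add_offset(0x1000, fmov_w_from_s(0x10u)),
              "same address either way");
```

The `static_assert`s make the point at compile time: once the value has gone through the `fmov`, a further zero-extension is the identity.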

@MatzeB MatzeB marked this pull request as ready for review April 11, 2025 20:37
@llvmbot
Member

llvmbot commented Apr 11, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Matthias Braun (MatzeB)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/135436.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp (+5-2)
  • (modified) llvm/test/CodeGen/AArch64/peephole-orr.mir (+44-1)
diff --git a/llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp b/llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp
index 36a7becbc76d3..71efeaf0d1b88 100644
--- a/llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp
+++ b/llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp
@@ -263,15 +263,18 @@ bool AArch64MIPeepholeOpt::visitORR(MachineInstr &MI) {
     // A COPY from an FPR will become a FMOVSWr, so do so now so that we know
     // that the upper bits are zero.
     if (RC != &AArch64::FPR32RegClass &&
-        ((RC != &AArch64::FPR64RegClass && RC != &AArch64::FPR128RegClass) ||
+        ((RC != &AArch64::FPR64RegClass && RC != &AArch64::FPR128RegClass &&
+          RC != &AArch64::ZPRRegClass) ||
          SrcMI->getOperand(1).getSubReg() != AArch64::ssub))
       return false;
-    Register CpySrc = SrcMI->getOperand(1).getReg();
+    Register CpySrc;
     if (SrcMI->getOperand(1).getSubReg() == AArch64::ssub) {
       CpySrc = MRI->createVirtualRegister(&AArch64::FPR32RegClass);
       BuildMI(*SrcMI->getParent(), SrcMI, SrcMI->getDebugLoc(),
               TII->get(TargetOpcode::COPY), CpySrc)
           .add(SrcMI->getOperand(1));
+    } else {
+      CpySrc = SrcMI->getOperand(1).getReg();
     }
     BuildMI(*SrcMI->getParent(), SrcMI, SrcMI->getDebugLoc(),
             TII->get(AArch64::FMOVSWr), SrcMI->getOperand(0).getReg())
diff --git a/llvm/test/CodeGen/AArch64/peephole-orr.mir b/llvm/test/CodeGen/AArch64/peephole-orr.mir
index 3431676438bd2..f718328ecf2d6 100644
--- a/llvm/test/CodeGen/AArch64/peephole-orr.mir
+++ b/llvm/test/CodeGen/AArch64/peephole-orr.mir
@@ -1,6 +1,49 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
 # RUN: llc -run-pass=aarch64-mi-peephole-opt -o - -mtriple=aarch64-unknown-linux -verify-machineinstrs %s | FileCheck %s
-
+---
+name: copy_fpr128_gpr32
+body: |
+  bb.0:
+    liveins: $q0
+    ; CHECK-LABEL: name: copy_fpr128_gpr32
+    ; CHECK: liveins: $q0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:fpr128 = COPY $q0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:fpr32 = COPY [[COPY]].ssub
+    ; CHECK-NEXT: [[FMOVSWr:%[0-9]+]]:gpr32 = FMOVSWr [[COPY1]]
+    %0:fpr128 = COPY $q0
+    %1:gpr32 = COPY %0.ssub:fpr128
+    %2:gpr32 = ORRWrs $wzr, killed %1:gpr32, 0
+...
+---
+name: copy_fpr32_gpr32
+body: |
+  bb.0:
+    liveins: $s0
+    ; CHECK-LABEL: name: copy_fpr32_gpr32
+    ; CHECK: liveins: $s0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:fpr32 = COPY $s0
+    ; CHECK-NEXT: [[FMOVSWr:%[0-9]+]]:gpr32 = FMOVSWr [[COPY]]
+    %0:fpr32 = COPY $s0
+    %1:gpr32 = COPY %0:fpr32
+    %2:gpr32 = ORRWrs $wzr, killed %1:gpr32, 0
+...
+---
+name: copy_zpr_gpr32
+body: |
+  bb.0:
+    liveins: $z0
+    ; CHECK-LABEL: name: copy_zpr_gpr32
+    ; CHECK: liveins: $z0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:zpr = COPY $z0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:fpr32 = COPY [[COPY]].ssub
+    ; CHECK-NEXT: [[FMOVSWr:%[0-9]+]]:gpr32 = FMOVSWr [[COPY1]]
+    %0:zpr = COPY $z0
+    %1:gpr32 = COPY %0.ssub:zpr
+    %2:gpr32 = ORRWrs $wzr, killed %1:gpr32, 0
+...
 ---
 name: copy_multiple_uses
 tracksRegLiveness: true
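
The register-class condition changed in `visitORR` can be read as a small predicate. The following is a standalone sketch under stated assumptions, not the LLVM implementation: `RegClass`, `SubReg`, `canFoldCopyToFMOV`, and the `allowZPR` switch are hypothetical stand-ins for the `AArch64::*RegClass` pointers and sub-register indices used in the real code.

```cpp
#include <cassert>

// Hypothetical stand-ins for the register classes and sub-register indices
// that appear in the diff.
enum class RegClass { GPR32, FPR32, FPR64, FPR128, ZPR };
enum class SubReg { None, ssub, dsub };

// Returns true when the COPY source qualifies for the FMOVSWr rewrite, i.e.
// when the early `return false` in the diff is NOT taken.
// allowZPR = false models the condition before this PR, true models it after.
constexpr bool canFoldCopyToFMOV(RegClass rc, SubReg sub, bool allowZPR = true) {
  if (rc == RegClass::FPR32)
    return true;  // a plain 32-bit FP register always qualifies
  const bool wideClass = rc == RegClass::FPR64 || rc == RegClass::FPR128 ||
                         (allowZPR && rc == RegClass::ZPR);
  // Wider classes qualify only when the copy reads the 32-bit ssub lane.
  return wideClass && sub == SubReg::ssub;
}

// These mirror the three MIR tests added in the diff.
static_assert(canFoldCopyToFMOV(RegClass::FPR128, SubReg::ssub), "copy_fpr128_gpr32");
static_assert(canFoldCopyToFMOV(RegClass::FPR32, SubReg::None), "copy_fpr32_gpr32");
static_assert(canFoldCopyToFMOV(RegClass::ZPR, SubReg::ssub), "copy_zpr_gpr32");
// Before the change, the ZPR case was rejected.
static_assert(!canFoldCopyToFMOV(RegClass::ZPR, SubReg::ssub, /*allowZPR=*/false),
              "ZPR not handled before this PR");
```

In the real pass, the `ssub` case additionally inserts a COPY into a fresh `FPR32` virtual register before the `FMOVSWr`, which is why the diff now assigns `CpySrc` in both branches.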

Collaborator

@davemgreen davemgreen left a comment

LGTM

@MatzeB MatzeB merged commit ed96e46 into llvm:main Apr 14, 2025
13 of 15 checks passed
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025