Skip to content

[PowerPC] Expand global named register support #113482

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 24, 2024

Conversation

lei137
Copy link
Contributor

@lei137 lei137 commented Oct 23, 2024

Enable all valid registers for intrinsics that read from and write
to global named registers.

@lei137 lei137 self-assigned this Oct 23, 2024
@lei137 lei137 requested a review from nathanchance October 23, 2024 18:17
@lei137 lei137 marked this pull request as ready for review October 23, 2024 18:19
@llvmbot
Copy link
Member

llvmbot commented Oct 23, 2024

@llvm/pr-subscribers-backend-powerpc

Author: Lei Huang (lei137)

Changes

Enable all valid registers for intrinsics that read from and write
to global named registers.


Full diff: https://github.com/llvm/llvm-project/pull/113482.diff

5 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCISelLowering.cpp (+22-14)
  • (modified) llvm/test/CodeGen/PowerPC/named-reg-alloc-r0.ll (+1-3)
  • (modified) llvm/test/CodeGen/PowerPC/named-reg-alloc-r2-64.ll (+1-3)
  • (modified) llvm/test/CodeGen/PowerPC/named-reg-alloc-r2.ll (+1-3)
  • (added) llvm/test/CodeGen/PowerPC/named-reg-alloc.ll (+144)
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 7199fac9b110b6..ab31898e262e7e 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -17367,25 +17367,33 @@ SDValue PPCTargetLowering::LowerFRAMEADDR(SDValue Op,
   return FrameAddr;
 }
 
-// FIXME? Maybe this could be a TableGen attribute on some registers and
-// this table could be generated automatically from RegInfo.
-Register PPCTargetLowering::getRegisterByName(const char* RegName, LLT VT,
+#define GET_REGISTER_MATCHER
+#include "PPCGenAsmMatcher.inc"
+
+Register PPCTargetLowering::getRegisterByName(const char *RegName, LLT VT,
                                               const MachineFunction &MF) const {
-  bool isPPC64 = Subtarget.isPPC64();
+  bool IsPPC64 = Subtarget.isPPC64();
 
-  bool is64Bit = isPPC64 && VT == LLT::scalar(64);
-  if (!is64Bit && VT != LLT::scalar(32))
+  bool Is64Bit = IsPPC64 && VT == LLT::scalar(64);
+  if (!Is64Bit && VT != LLT::scalar(32))
     report_fatal_error("Invalid register global variable type");
 
-  Register Reg = StringSwitch<Register>(RegName)
-                     .Case("r1", is64Bit ? PPC::X1 : PPC::R1)
-                     .Case("r2", isPPC64 ? Register() : PPC::R2)
-                     .Case("r13", (is64Bit ? PPC::X13 : PPC::R13))
-                     .Default(Register());
+  Register Reg = MatchRegisterName(RegName);
+  if (!Reg)
+    report_fatal_error(
+        Twine("Invalid global name register \"" + StringRef(RegName) + "\"."));
+
+  // FIXME: Unable to generate code for `-O2` but okay for `-O0`.
+  // Need followup investigation as to why.
+  if ((IsPPC64 && Reg == PPC::R2) || Reg == PPC::R0)
+    report_fatal_error(Twine("Trying to reserve an invalid register \"" +
+                             StringRef(RegName) + "\"."));
+
+  // Convert GPR to GP8R register for 64bit.
+  if (Is64Bit && StringRef(RegName).starts_with_insensitive("r"))
+    Reg = Reg.id() - PPC::R0 + PPC::X0;
 
-  if (Reg)
-    return Reg;
-  report_fatal_error("Invalid register name global variable");
+  return Reg;
 }
 
 bool PPCTargetLowering::isAccessedAsGotIndirect(SDValue GA) const {
diff --git a/llvm/test/CodeGen/PowerPC/named-reg-alloc-r0.ll b/llvm/test/CodeGen/PowerPC/named-reg-alloc-r0.ll
index 11cb72296e2c43..8f4351f795ec94 100644
--- a/llvm/test/CodeGen/PowerPC/named-reg-alloc-r0.ll
+++ b/llvm/test/CodeGen/PowerPC/named-reg-alloc-r0.ll
@@ -1,11 +1,9 @@
 ; RUN: not --crash llc < %s -mtriple=powerpc-unknown-linux-gnu 2>&1 | FileCheck %s
-; RUN: not --crash llc < %s -mtriple=powerpc-unknown-linux-gnu 2>&1 | FileCheck %s
 ; RUN: not --crash llc < %s -mtriple=powerpc64-unknown-linux-gnu 2>&1 | FileCheck %s
 
 define i32 @get_reg() nounwind {
 entry:
-; FIXME: Include an allocatable-specific error message
-; CHECK: Invalid register name global variable
+; CHECK: Trying to reserve an invalid register "r0".
         %reg = call i32 @llvm.read_register.i32(metadata !0)
   ret i32 %reg
 }
diff --git a/llvm/test/CodeGen/PowerPC/named-reg-alloc-r2-64.ll b/llvm/test/CodeGen/PowerPC/named-reg-alloc-r2-64.ll
index 3df778f445c733..fdf7ea2711d827 100644
--- a/llvm/test/CodeGen/PowerPC/named-reg-alloc-r2-64.ll
+++ b/llvm/test/CodeGen/PowerPC/named-reg-alloc-r2-64.ll
@@ -1,10 +1,8 @@
 ; RUN: not --crash llc < %s -mtriple=powerpc64-unknown-linux-gnu 2>&1 | FileCheck %s
-; RUN: not --crash llc < %s -mtriple=powerpc64-unknown-linux-gnu 2>&1 | FileCheck %s
 
 define i64 @get_reg() nounwind {
 entry:
-; FIXME: Include an allocatable-specific error message
-; CHECK: Invalid register name global variable
+; CHECK: Trying to reserve an invalid register "r2".
         %reg = call i64 @llvm.read_register.i64(metadata !0)
   ret i64 %reg
 }
diff --git a/llvm/test/CodeGen/PowerPC/named-reg-alloc-r2.ll b/llvm/test/CodeGen/PowerPC/named-reg-alloc-r2.ll
index ca79f857548ebe..c09b8bb3f49c31 100644
--- a/llvm/test/CodeGen/PowerPC/named-reg-alloc-r2.ll
+++ b/llvm/test/CodeGen/PowerPC/named-reg-alloc-r2.ll
@@ -3,11 +3,9 @@
 
 define i32 @get_reg() nounwind {
 entry:
-; FIXME: Include an allocatable-specific error message
-; CHECK-NOTPPC32: Invalid register name global variable
+; CHECK-NOTPPC32: Trying to reserve an invalid register "r2".
         %reg = call i32 @llvm.read_register.i32(metadata !0)
   ret i32 %reg
-
 ; CHECK-LABEL: @get_reg
 ; CHECK: mr 3, 2
 }
diff --git a/llvm/test/CodeGen/PowerPC/named-reg-alloc.ll b/llvm/test/CodeGen/PowerPC/named-reg-alloc.ll
new file mode 100644
index 00000000000000..38d22475ead910
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/named-reg-alloc.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -O0 -verify-machineinstrs < %s -mtriple=powerpc-unknown-linux-gnu 2>&1 | FileCheck %s
+; RUN: llc -O0 -verify-machineinstrs < %s -mtriple=powerpc64-unknown-linux-gnu 2>&1 | FileCheck %s --check-prefix=CHECK64
+
+@mVal = dso_local global i32 15, align 4
+@myGVal = dso_local global i32 0, align 4
+
+define dso_local void @testSetIntReg(i32 noundef signext %xx) {
+; CHECK-LABEL: testSetIntReg:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    mr 5, 3
+; CHECK-NEXT:    blr
+;
+; CHECK64-LABEL: testSetIntReg:
+; CHECK64:       # %bb.0: # %entry
+; CHECK64-NEXT:    mr 5, 3
+; CHECK64-NEXT:    blr
+entry:
+  tail call void @llvm.write_register.i32(metadata !0, i32 %xx)
+  ret void
+}
+
+declare void @llvm.write_register.i32(metadata, i32)
+
+define dso_local signext range(i32 0, 2) i32 @testCmpReg() {
+; CHECK-LABEL: testCmpReg:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    lis 3, mVal@ha
+; CHECK-NEXT:    lwz 3, mVal@l(3)
+; CHECK-NEXT:    xori 3, 3, 15
+; CHECK-NEXT:    cntlzw 3, 3
+; CHECK-NEXT:    srwi 3, 3, 5
+; CHECK-NEXT:    blr
+;
+; CHECK64-LABEL: testCmpReg:
+; CHECK64:       # %bb.0: # %entry
+; CHECK64-NEXT:    addis 3, 2, mVal@toc@ha
+; CHECK64-NEXT:    addi 3, 3, mVal@toc@l
+; CHECK64-NEXT:    lwz 3, 0(3)
+; CHECK64-NEXT:    xori 3, 3, 15
+; CHECK64-NEXT:    cntlzw 3, 3
+; CHECK64-NEXT:    srwi 3, 3, 5
+; CHECK64-NEXT:    extsw 3, 3
+; CHECK64-NEXT:    blr
+entry:
+  tail call void @llvm.write_register.i32(metadata !0, i32 15)
+  %0 = load i32, ptr @mVal, align 4
+  %1 = tail call i32 @llvm.read_register.i32(metadata !0)
+  %cmp = icmp eq i32 %0, %1
+  %conv = zext i1 %cmp to i32
+  ret i32 %conv
+}
+
+declare i32 @llvm.read_register.i32(metadata)
+
+define dso_local void @testSetIntReg2(i32 noundef signext %xx) {
+; CHECK-LABEL: testSetIntReg2:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    stwu 1, -48(1)
+; CHECK-NEXT:    .cfi_def_cfa_offset 48
+; CHECK-NEXT:    .cfi_offset r23, -36
+; CHECK-NEXT:    stw 23, 12(1) # 4-byte Folded Spill
+; CHECK-NEXT:    mr 23, 3
+; CHECK-NEXT:    lwz 23, 12(1) # 4-byte Folded Reload
+; CHECK-NEXT:    addi 1, 1, 48
+; CHECK-NEXT:    blr
+;
+; CHECK64-LABEL: testSetIntReg2:
+; CHECK64:       # %bb.0: # %entry
+; CHECK64-NEXT:    std 23, -72(1) # 8-byte Folded Spill
+; CHECK64-NEXT:    mr 23, 3
+; CHECK64-NEXT:    ld 23, -72(1) # 8-byte Folded Reload
+; CHECK64-NEXT:    blr
+entry:
+  tail call void @llvm.write_register.i32(metadata !1, i32 %xx)
+  ret void
+}
+
+define dso_local signext i32 @testReturnReg() {
+; CHECK-LABEL: testReturnReg:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    stwu 1, -48(1)
+; CHECK-NEXT:    .cfi_def_cfa_offset 48
+; CHECK-NEXT:    .cfi_offset r23, -36
+; CHECK-NEXT:    stw 23, 12(1) # 4-byte Folded Spill
+; CHECK-NEXT:    li 23, 125
+; CHECK-NEXT:    mr 3, 23
+; CHECK-NEXT:    lwz 23, 12(1) # 4-byte Folded Reload
+; CHECK-NEXT:    addi 1, 1, 48
+; CHECK-NEXT:    blr
+;
+; CHECK64-LABEL: testReturnReg:
+; CHECK64:       # %bb.0: # %entry
+; CHECK64-NEXT:    std 23, -72(1) # 8-byte Folded Spill
+; CHECK64-NEXT:    li 23, 125
+; CHECK64-NEXT:    extsw 3, 23
+; CHECK64-NEXT:    ld 23, -72(1) # 8-byte Folded Reload
+; CHECK64-NEXT:    blr
+entry:
+  tail call void @llvm.write_register.i32(metadata !1, i32 125)
+  %0 = tail call i32 @llvm.read_register.i32(metadata !1)
+  ret i32 %0
+}
+
+define dso_local void @testViaASM(i32 noundef signext %xx) {
+; CHECK-LABEL: testViaASM:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    stwu 1, -64(1)
+; CHECK-NEXT:    .cfi_def_cfa_offset 64
+; CHECK-NEXT:    .cfi_offset r20, -48
+; CHECK-NEXT:    stw 20, 16(1) # 4-byte Folded Spill
+; CHECK-NEXT:    mr 20, 3
+; CHECK-NEXT:    #APP
+; CHECK-NEXT:    addi 3, 1, 1
+; CHECK-NEXT:    #NO_APP
+; CHECK-NEXT:    lis 4, myGVal@ha
+; CHECK-NEXT:    stw 3, myGVal@l(4)
+; CHECK-NEXT:    lwz 20, 16(1) # 4-byte Folded Reload
+; CHECK-NEXT:    addi 1, 1, 64
+; CHECK-NEXT:    blr
+;
+; CHECK64-LABEL: testViaASM:
+; CHECK64:       # %bb.0: # %entry
+; CHECK64-NEXT:    std 20, -96(1) # 8-byte Folded Spill
+; CHECK64-NEXT:    mr 20, 3
+; CHECK64-NEXT:    #APP
+; CHECK64-NEXT:    addi 3, 1, 1
+; CHECK64-NEXT:    #NO_APP
+; CHECK64-NEXT:    addis 4, 2, myGVal@toc@ha
+; CHECK64-NEXT:    addi 4, 4, myGVal@toc@l
+; CHECK64-NEXT:    stw 3, 0(4)
+; CHECK64-NEXT:    ld 20, -96(1) # 8-byte Folded Reload
+; CHECK64-NEXT:    blr
+entry:
+  tail call void @llvm.write_register.i32(metadata !2, i32 %xx)
+  %0 = tail call i32 @llvm.read_register.i32(metadata !2)
+  %1 = tail call i32 asm "addi $0, $2, $2", "=r,{r20},K"(i32 %0, i32 1)
+  store i32 %1, ptr @myGVal, align 4
+  ret void
+}
+
+!0 = !{!"r5"}
+!1 = !{!"r23"}
+!2 = !{!"r20"}

Copy link
Member

@redstar redstar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@nathanchance
Copy link
Member

This version of the change does not have the issue that I reported earlier. Thanks for the quick fix!

@lei137 lei137 merged commit 522f34c into llvm:main Oct 24, 2024
12 checks passed
@lei137 lei137 deleted the lei/getRegisterByName branch October 24, 2024 14:05
// this table could be generated automatically from RegInfo.
Register PPCTargetLowering::getRegisterByName(const char* RegName, LLT VT,
#define GET_REGISTER_MATCHER
#include "PPCGenAsmMatcher.inc"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most targets don't include AsmMatcher outside of their AsmPartner subtarget, see also https://reviews.llvm.org/D69130#inline-627969

It'd be nice if there was a way to have nicer deps here if possible.

(If this is the best that's possible, it's fine as-is. I thought I'd raise this though, and maybe there's a nice fix.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it is a bit strange when I was doing it. However the table gen function was exactly what I needed and I didn't want to duplicate it here. Unfortunately I don't have a nice fix for this at this time. Any suggestion?

nico added a commit that referenced this pull request Oct 24, 2024
Similar to 4168a2e. See also comment on #113482.
@frobtech frobtech mentioned this pull request Oct 25, 2024
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
Enable all valid registers for intrinsics that read from and write
to global named registers.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
Similar to 4168a2e. See also comment on llvm#113482.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants