Skip to content

[RISCV] Don't use x7 as input argument for fastcc when Zicfilp enabled. #93321

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

yetingk
Copy link
Contributor

@yetingk yetingk commented May 24, 2024

Zicfilp needs x7(t2) as the landing pad label register.
https://github.com/riscv/riscv-cfi/blob/main/src/cfi_forward.adoc

@llvmbot
Copy link
Member

llvmbot commented May 24, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Yeting Kuo (yetingk)

Changes

Zicfilp needs x7(t2) as the landing pad label register.
https://github.com/riscv/riscv-cfi/blob/main/src/cfi_forward.adoc


Full diff: https://github.com/llvm/llvm-project/pull/93321.diff

1 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+18-8)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index f0e5a7d393b6c..47001cf025337 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -18402,7 +18402,8 @@ ArrayRef<MCPhysReg> RISCV::getArgGPRs(const RISCVABI::ABI ABI) {
   return ArrayRef(ArgIGPRs);
 }
 
-static ArrayRef<MCPhysReg> getFastCCArgGPRs(const RISCVABI::ABI ABI) {
+static ArrayRef<MCPhysReg> getFastCCArgGPRs(const RISCVABI::ABI ABI,
+                                            bool HasZicfilp) {
   // The GPRs used for passing arguments in the FastCC, X5 and X6 might be used
   // for save-restore libcall, so we don't use them.
   static const MCPhysReg FastCCIGPRs[] = {
@@ -18415,10 +18416,17 @@ static ArrayRef<MCPhysReg> getFastCCArgGPRs(const RISCVABI::ABI ABI) {
                                           RISCV::X13, RISCV::X14, RISCV::X15,
                                           RISCV::X7};
 
+  static const MCPhysReg FastCCIGPRsNonX7[] = {
+      RISCV::X10, RISCV::X11, RISCV::X12, RISCV::X13, RISCV::X14, RISCV::X15,
+      RISCV::X16, RISCV::X17, RISCV::X28, RISCV::X29, RISCV::X30, RISCV::X31};
+
+  static const MCPhysReg FastCCEGPRsNonX7[] = {
+      RISCV::X10, RISCV::X11, RISCV::X12, RISCV::X13, RISCV::X14, RISCV::X15};
+
   if (ABI == RISCVABI::ABI_ILP32E || ABI == RISCVABI::ABI_LP64E)
-    return ArrayRef(FastCCEGPRs);
+    return HasZicfilp ? ArrayRef(FastCCEGPRsNonX7) : ArrayRef(FastCCEGPRs);
 
-  return ArrayRef(FastCCIGPRs);
+  return HasZicfilp ? ArrayRef(FastCCIGPRsNonX7) : ArrayRef(FastCCIGPRs);
 }
 
 // Pass a 2*XLEN argument that has been split into two XLEN values through
@@ -18962,15 +18970,16 @@ bool RISCV::CC_RISCV_FastCC(const DataLayout &DL, RISCVABI::ABI ABI,
                             bool IsFixed, bool IsRet, Type *OrigTy,
                             const RISCVTargetLowering &TLI,
                             RVVArgDispatcher &RVVDispatcher) {
+  const RISCVSubtarget &Subtarget = TLI.getSubtarget();
+  bool HasZicfilp = Subtarget.hasStdExtZicfilp();
+
   if (LocVT == MVT::i32 || LocVT == MVT::i64) {
-    if (unsigned Reg = State.AllocateReg(getFastCCArgGPRs(ABI))) {
+    if (unsigned Reg = State.AllocateReg(getFastCCArgGPRs(ABI, HasZicfilp))) {
       State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));
       return false;
     }
   }
 
-  const RISCVSubtarget &Subtarget = TLI.getSubtarget();
-
   if (LocVT == MVT::f16 &&
       (Subtarget.hasStdExtZfh() || Subtarget.hasStdExtZfhmin())) {
     static const MCPhysReg FPR16List[] = {
@@ -19014,7 +19023,7 @@ bool RISCV::CC_RISCV_FastCC(const DataLayout &DL, RISCVABI::ABI ABI,
       (LocVT == MVT::f32 && Subtarget.hasStdExtZfinx()) ||
       (LocVT == MVT::f64 && Subtarget.is64Bit() &&
        Subtarget.hasStdExtZdinx())) {
-    if (unsigned Reg = State.AllocateReg(getFastCCArgGPRs(ABI))) {
+    if (unsigned Reg = State.AllocateReg(getFastCCArgGPRs(ABI, HasZicfilp))) {
       State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));
       return false;
     }
@@ -19049,7 +19058,8 @@ bool RISCV::CC_RISCV_FastCC(const DataLayout &DL, RISCVABI::ABI ABI,
           CCValAssign::getReg(ValNo, ValVT, AllocatedVReg, LocVT, LocInfo));
     } else {
       // Try and pass the address via a "fast" GPR.
-      if (unsigned GPRReg = State.AllocateReg(getFastCCArgGPRs(ABI))) {
+      if (unsigned GPRReg =
+              State.AllocateReg(getFastCCArgGPRs(ABI, HasZicfilp))) {
         LocInfo = CCValAssign::Indirect;
         LocVT = TLI.getSubtarget().getXLenVT();
         State.addLoc(CCValAssign::getReg(ValNo, ValVT, GPRReg, LocVT, LocInfo));

@topperc
Copy link
Collaborator

topperc commented May 24, 2024

Test?

@@ -18415,10 +18416,18 @@ static ArrayRef<MCPhysReg> getFastCCArgGPRs(const RISCVABI::ABI ABI) {
RISCV::X13, RISCV::X14, RISCV::X15,
RISCV::X7};

// Zicfilp needs needs x7(t2) as the landing pad label register.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much do we lose if we just never use X7 for fastcc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do some test in sifive internal.

yetingk pushed a commit to yetingk/llvm-project that referenced this pull request Jun 26, 2024
Like llvm#93321, this patch also tries to solve the conflict usage of x7 for
fastcc and Zicfilp. But this patch removes x7 from fastcc directly. Its
purpose is to reduce the code complexity of llvm#93321, and we also found
that it at most increase 0.02% instruction count for most benchmarks and it
might be benefit to overall benchmarks.
yetingk pushed a commit to yetingk/llvm-project that referenced this pull request Jul 16, 2024
Like llvm#93321, this patch also tries to solve the conflict usage of x7 for
fastcc and Zicfilp. But this patch removes x7 from fastcc directly. Its
purpose is to reduce the code complexity of llvm#93321, and we also found
that it at most increase 0.02% instruction count for most benchmarks and it
might be benefit to overall benchmarks.
yetingk added a commit that referenced this pull request Jul 17, 2024
Like #93321, this patch also tries to solve the conflict usage of x7 for
fastcc and Zicfilp. But this patch removes x7 from fastcc directly. Its
purpose is to reduce the code complexity of #93321, and we also found
that it at most increase 0.02% instruction count for most benchmarks and
it might be benefit for benchmarks.
@yetingk
Copy link
Contributor Author

yetingk commented Jul 18, 2024

Close it since the same target pr #96729 merge.

@yetingk yetingk closed this Jul 18, 2024
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
Like #93321, this patch also tries to solve the conflict usage of x7 for
fastcc and Zicfilp. But this patch removes x7 from fastcc directly. Its
purpose is to reduce the code complexity of #93321, and we also found
that it at most increase 0.02% instruction count for most benchmarks and
it might be benefit for benchmarks.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60250997
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants