Skip to content

[RISCV] Fix wrong offset use caused by missing the size of Zcmp push. #66613

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 25, 2023

Conversation

yetingk
Copy link
Contributor

@yetingk yetingk commented Sep 18, 2023

This fixes two wrong offset uses,

  1. .cfi_offset of callee saves are not pushed by cm.push.
  2. Reference of frame objests by frame pointer.

@yetingk yetingk requested review from tclin914, Xinlong-Wu and a team September 18, 2023 06:09
@llvmbot
Copy link
Member

llvmbot commented Sep 18, 2023

@llvm/pr-subscribers-backend-risc-v

Changes

This fixes two wrong offset uses,

  1. .cfi_offset of callee saves are not pushed by cm.push.
  2. Reference of frame objests by frame pointer.

Full diff: https://github.com/llvm/llvm-project/pull/66613.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+6-10)
  • (modified) llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h (+4)
  • (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+124)
  • (added) llvm/test/CodeGen/RISCV/zcmp-with-float.ll (+46)
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index add933250f8473d..4b28b9d56ff10c6 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -515,8 +515,7 @@ void RISCVFrameLowering::emitPrologue(MachineFunction &MF,
   // FIXME (note copied from Lanai): This appears to be overallocating.  Needs
   // investigation. Get the number of bytes to allocate from the FrameInfo.
   uint64_t StackSize = getStackSizeWithRVVPadding(MF);
-  uint64_t RealStackSize =
-      StackSize + RVFI->getLibCallStackSize() + RVFI->getRVPushStackSize();
+  uint64_t RealStackSize = StackSize + RVFI->getReservedSpillsSize();
   uint64_t RVVStackSize = RVFI->getRVVStackSize();
 
   // Early exit if there is no need to allocate on the stack
@@ -585,8 +584,7 @@ void RISCVFrameLowering::emitPrologue(MachineFunction &MF,
         Offset = FrameIdx * (int64_t)STI.getXLen() / 8;
       }
     } else {
-      Offset = MFI.getObjectOffset(FrameIdx) -
-               RVFI->getLibCallStackSize();
+      Offset = MFI.getObjectOffset(FrameIdx) - RVFI->getReservedSpillsSize();
     }
     Register Reg = Entry.getReg();
     unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset(
@@ -731,8 +729,7 @@ void RISCVFrameLowering::emitEpilogue(MachineFunction &MF,
     LastFrameDestroy = std::prev(MBBI, CSI.size());
 
   uint64_t StackSize = getStackSizeWithRVVPadding(MF);
-  uint64_t RealStackSize =
-      StackSize + RVFI->getLibCallStackSize() + RVFI->getRVPushStackSize();
+  uint64_t RealStackSize = StackSize + RVFI->getReservedSpillsSize();
   uint64_t FPOffset = RealStackSize - RVFI->getVarArgsSaveSize();
   uint64_t RVVStackSize = RVFI->getRVVStackSize();
 
@@ -883,7 +880,7 @@ RISCVFrameLowering::getFrameIndexReference(const MachineFunction &MF, int FI,
   if (FrameReg == getFPReg(STI)) {
     Offset += StackOffset::getFixed(RVFI->getVarArgsSaveSize());
     if (FI >= 0)
-      Offset -= StackOffset::getFixed(RVFI->getLibCallStackSize());
+      Offset -= StackOffset::getFixed(RVFI->getReservedSpillsSize());
     // When using FP to access scalable vector objects, we need to minus
     // the frame size.
     //
@@ -951,8 +948,7 @@ RISCVFrameLowering::getFrameIndexReference(const MachineFunction &MF, int FI,
       assert(!RI->hasStackRealignment(MF) &&
              "Can't index across variable sized realign");
       Offset += StackOffset::get(getStackSizeWithRVVPadding(MF) +
-                                     RVFI->getLibCallStackSize() +
-                                     RVFI->getRVPushStackSize(),
+                                     RVFI->getReservedSpillsSize(),
                                  RVFI->getRVVStackSize());
     } else {
       Offset += StackOffset::getFixed(MFI.getStackSize());
@@ -1297,7 +1293,7 @@ RISCVFrameLowering::getFirstSPAdjustAmount(const MachineFunction &MF) const {
   // Disable SplitSPAdjust if save-restore libcall is used. The callee-saved
   // registers will be pushed by the save-restore libcalls, so we don't have to
   // split the SP adjustment in this case.
-  if (RVFI->getLibCallStackSize() || RVFI->getRVPushStackSize())
+  if (RVFI->getReservedSpillsSize())
     return 0;
 
   // Return the FirstSPAdjustAmount if the StackSize can not fit in a signed
diff --git a/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h b/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h
index 099ebb4014ca46f..4ffdb6145698f40 100644
--- a/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h
@@ -104,6 +104,10 @@ class RISCVMachineFunctionInfo : public MachineFunctionInfo {
     BranchRelaxationScratchFrameIndex = Index;
   }
 
+  unsigned getReservedSpillsSize() const {
+    return LibCallStackSize + RVPushStackSize;
+  }
+
   unsigned getLibCallStackSize() const { return LibCallStackSize; }
   void setLibCallStackSize(unsigned Size) { LibCallStackSize = Size; }
 
diff --git a/llvm/test/CodeGen/RISCV/push-pop-popret.ll b/llvm/test/CodeGen/RISCV/push-pop-popret.ll
index af3828ed7d839ce..353b015e3e7a0d3 100644
--- a/llvm/test/CodeGen/RISCV/push-pop-popret.ll
+++ b/llvm/test/CodeGen/RISCV/push-pop-popret.ll
@@ -3130,3 +3130,127 @@ define void @callee_no_irq() nounwind{
   store volatile [32 x i32] %val, [32 x i32]* @var_test_irq
   ret void
 }
+
+declare void @bar(ptr, ptr)
+declare ptr @llvm.frameaddress.p0(i32 immarg)
+
+define i32 @use_fp(i32 %x) {
+; RV32IZCMP-LABEL: use_fp:
+; RV32IZCMP:       # %bb.0: # %entry
+; RV32IZCMP-NEXT:    cm.push {ra, s0-s1}, -32
+; RV32IZCMP-NEXT:    .cfi_def_cfa_offset 32
+; RV32IZCMP-NEXT:    .cfi_offset ra, -12
+; RV32IZCMP-NEXT:    .cfi_offset s0, -8
+; RV32IZCMP-NEXT:    .cfi_offset s1, -4
+; RV32IZCMP-NEXT:    addi s0, sp, 32
+; RV32IZCMP-NEXT:    .cfi_def_cfa s0, 0
+; RV32IZCMP-NEXT:    mv s1, a0
+; RV32IZCMP-NEXT:    addi a1, s0, -20
+; RV32IZCMP-NEXT:    mv a0, s0
+; RV32IZCMP-NEXT:    call bar@plt
+; RV32IZCMP-NEXT:    mv a0, s1
+; RV32IZCMP-NEXT:    cm.popret {ra, s0-s1}, 32
+;
+; RV64IZCMP-LABEL: use_fp:
+; RV64IZCMP:       # %bb.0: # %entry
+; RV64IZCMP-NEXT:    cm.push {ra, s0-s1}, -48
+; RV64IZCMP-NEXT:    .cfi_def_cfa_offset 48
+; RV64IZCMP-NEXT:    .cfi_offset ra, -24
+; RV64IZCMP-NEXT:    .cfi_offset s0, -16
+; RV64IZCMP-NEXT:    .cfi_offset s1, -8
+; RV64IZCMP-NEXT:    addi s0, sp, 48
+; RV64IZCMP-NEXT:    .cfi_def_cfa s0, 0
+; RV64IZCMP-NEXT:    mv s1, a0
+; RV64IZCMP-NEXT:    addi a1, s0, -36
+; RV64IZCMP-NEXT:    mv a0, s0
+; RV64IZCMP-NEXT:    call bar@plt
+; RV64IZCMP-NEXT:    mv a0, s1
+; RV64IZCMP-NEXT:    cm.popret {ra, s0-s1}, 48
+;
+; RV32IZCMP-SR-LABEL: use_fp:
+; RV32IZCMP-SR:       # %bb.0: # %entry
+; RV32IZCMP-SR-NEXT:    call t0, __riscv_save_2
+; RV32IZCMP-SR-NEXT:    addi sp, sp, -16
+; RV32IZCMP-SR-NEXT:    .cfi_def_cfa_offset 32
+; RV32IZCMP-SR-NEXT:    .cfi_offset ra, -4
+; RV32IZCMP-SR-NEXT:    .cfi_offset s0, -8
+; RV32IZCMP-SR-NEXT:    .cfi_offset s1, -12
+; RV32IZCMP-SR-NEXT:    addi s0, sp, 32
+; RV32IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
+; RV32IZCMP-SR-NEXT:    mv s1, a0
+; RV32IZCMP-SR-NEXT:    addi a1, s0, -20
+; RV32IZCMP-SR-NEXT:    mv a0, s0
+; RV32IZCMP-SR-NEXT:    call bar@plt
+; RV32IZCMP-SR-NEXT:    mv a0, s1
+; RV32IZCMP-SR-NEXT:    addi sp, sp, 16
+; RV32IZCMP-SR-NEXT:    tail __riscv_restore_2
+;
+; RV64IZCMP-SR-LABEL: use_fp:
+; RV64IZCMP-SR:       # %bb.0: # %entry
+; RV64IZCMP-SR-NEXT:    call t0, __riscv_save_2
+; RV64IZCMP-SR-NEXT:    addi sp, sp, -16
+; RV64IZCMP-SR-NEXT:    .cfi_def_cfa_offset 48
+; RV64IZCMP-SR-NEXT:    .cfi_offset ra, -8
+; RV64IZCMP-SR-NEXT:    .cfi_offset s0, -16
+; RV64IZCMP-SR-NEXT:    .cfi_offset s1, -24
+; RV64IZCMP-SR-NEXT:    addi s0, sp, 48
+; RV64IZCMP-SR-NEXT:    .cfi_def_cfa s0, 0
+; RV64IZCMP-SR-NEXT:    mv s1, a0
+; RV64IZCMP-SR-NEXT:    addi a1, s0, -36
+; RV64IZCMP-SR-NEXT:    mv a0, s0
+; RV64IZCMP-SR-NEXT:    call bar@plt
+; RV64IZCMP-SR-NEXT:    mv a0, s1
+; RV64IZCMP-SR-NEXT:    addi sp, sp, 16
+; RV64IZCMP-SR-NEXT:    tail __riscv_restore_2
+;
+; RV32I-LABEL: use_fp:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    .cfi_def_cfa_offset 16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sw s0, 8(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sw s1, 4(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    .cfi_offset ra, -4
+; RV32I-NEXT:    .cfi_offset s0, -8
+; RV32I-NEXT:    .cfi_offset s1, -12
+; RV32I-NEXT:    addi s0, sp, 16
+; RV32I-NEXT:    .cfi_def_cfa s0, 0
+; RV32I-NEXT:    mv s1, a0
+; RV32I-NEXT:    addi a1, s0, -16
+; RV32I-NEXT:    mv a0, s0
+; RV32I-NEXT:    call bar@plt
+; RV32I-NEXT:    mv a0, s1
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    lw s0, 8(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    lw s1, 4(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: use_fp:
+; RV64I:       # %bb.0: # %entry
+; RV64I-NEXT:    addi sp, sp, -32
+; RV64I-NEXT:    .cfi_def_cfa_offset 32
+; RV64I-NEXT:    sd ra, 24(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    sd s0, 16(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    sd s1, 8(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    .cfi_offset ra, -8
+; RV64I-NEXT:    .cfi_offset s0, -16
+; RV64I-NEXT:    .cfi_offset s1, -24
+; RV64I-NEXT:    addi s0, sp, 32
+; RV64I-NEXT:    .cfi_def_cfa s0, 0
+; RV64I-NEXT:    mv s1, a0
+; RV64I-NEXT:    addi a1, s0, -28
+; RV64I-NEXT:    mv a0, s0
+; RV64I-NEXT:    call bar@plt
+; RV64I-NEXT:    mv a0, s1
+; RV64I-NEXT:    ld ra, 24(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    ld s0, 16(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    ld s1, 8(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    addi sp, sp, 32
+; RV64I-NEXT:    ret
+entry:
+  %var = alloca i32, align 4
+  %0 = tail call ptr @llvm.frameaddress.p0(i32 0)
+  call void @bar(ptr %0, ptr %var)
+  ret i32 %x
+}
diff --git a/llvm/test/CodeGen/RISCV/zcmp-with-float.ll b/llvm/test/CodeGen/RISCV/zcmp-with-float.ll
new file mode 100644
index 000000000000000..ba1b3984be17cfe
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/zcmp-with-float.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
+; RUN: llc -mtriple riscv32 -mattr=+f,+zcmp -target-abi ilp32f < %s | FileCheck %s --check-prefix=RV32
+; RUN: llc -mtriple riscv64 -mattr=+f,+zcmp -target-abi lp64f < %s | FileCheck %s --check-prefix=RV64
+
+define void @foo(i32 %x, float %y) {
+; RV32-LABEL: foo:
+; RV32:       # %bb.0: # %entry
+; RV32-NEXT:    cm.push {ra, s0}, -32
+; RV32-NEXT:    .cfi_def_cfa_offset 32
+; RV32-NEXT:    fsw fs0, 12(sp) # 4-byte Folded Spill
+; RV32-NEXT:    .cfi_offset ra, -8
+; RV32-NEXT:    .cfi_offset s0, -4
+; RV32-NEXT:    .cfi_offset fs0, -20
+; RV32-NEXT:    fmv.s fs0, fa0
+; RV32-NEXT:    mv s0, a0
+; RV32-NEXT:    call bar@plt
+; RV32-NEXT:    mv a0, s0
+; RV32-NEXT:    fmv.s fa0, fs0
+; RV32-NEXT:    flw fs0, 12(sp) # 4-byte Folded Reload
+; RV32-NEXT:    cm.pop {ra, s0}, 32
+; RV32-NEXT:    tail func@plt
+;
+; RV64-LABEL: foo:
+; RV64:       # %bb.0: # %entry
+; RV64-NEXT:    cm.push {ra, s0}, -32
+; RV64-NEXT:    .cfi_def_cfa_offset 32
+; RV64-NEXT:    fsw fs0, 12(sp) # 4-byte Folded Spill
+; RV64-NEXT:    .cfi_offset ra, -16
+; RV64-NEXT:    .cfi_offset s0, -8
+; RV64-NEXT:    .cfi_offset fs0, -20
+; RV64-NEXT:    fmv.s fs0, fa0
+; RV64-NEXT:    mv s0, a0
+; RV64-NEXT:    call bar@plt
+; RV64-NEXT:    mv a0, s0
+; RV64-NEXT:    fmv.s fa0, fs0
+; RV64-NEXT:    flw fs0, 12(sp) # 4-byte Folded Reload
+; RV64-NEXT:    cm.pop {ra, s0}, 32
+; RV64-NEXT:    tail func@plt
+entry:
+  tail call void @bar()
+  tail call void @func(i32 %x, float %y)
+  ret void
+}
+
+declare void @bar()
+declare void @func(i32, float)

Yeting Kuo added 2 commits September 20, 2023 15:39
This fixes two wrong offset uses,
1. .cfi_offset of callee saves are not pushed by cm.push.
2. Reference of frame objests by frame pointer.
@yetingk
Copy link
Contributor Author

yetingk commented Sep 20, 2023

Rebase and ping.

Copy link
Contributor

@tclin914 tclin914 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yetingk yetingk merged commit 7c70e50 into llvm:main Sep 25, 2023
@yetingk yetingk deleted the zcmp-fix2 branch January 2, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants