[RISCV] Save vector registers in interrupt handler. #143808
The generated code is pretty awful.
@llvm/pr-subscribers-backend-risc-v
Author: Craig Topper (topperc)
Changes
Corresponding gcc bug report https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110665
The generated code is pretty awful.
Patch is 339.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143808.diff
4 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVCallingConv.td b/llvm/lib/Target/RISCV/RISCVCallingConv.td
index 98e05b7f8eca7..0345a5c10a6fe 100644
--- a/llvm/lib/Target/RISCV/RISCVCallingConv.td
+++ b/llvm/lib/Target/RISCV/RISCVCallingConv.td
@@ -56,14 +56,44 @@ def CSR_XLEN_F32_Interrupt: CalleeSavedRegs<(add CSR_Interrupt,
def CSR_XLEN_F64_Interrupt: CalleeSavedRegs<(add CSR_Interrupt,
(sequence "F%u_D", 0, 31))>;
+// Same as CSR_Interrupt, but including all vector registers.
+def CSR_XLEN_V_Interrupt: CalleeSavedRegs<(add CSR_Interrupt,
+ (sequence "V%u", 0, 31))>;
+
+// Same as CSR_Interrupt, but including all 32-bit FP registers and all vector
+// registers.
+def CSR_XLEN_F32_V_Interrupt: CalleeSavedRegs<(add CSR_Interrupt,
+ (sequence "F%u_F", 0, 31),
+ (sequence "V%u", 0, 31))>;
+
+// Same as CSR_Interrupt, but including all 64-bit FP registers and all vector
+// registers.
+def CSR_XLEN_F64_V_Interrupt: CalleeSavedRegs<(add CSR_Interrupt,
+ (sequence "F%u_D", 0, 31),
+ (sequence "V%u", 0, 31))>;
+
// Same as CSR_Interrupt, but excluding X16-X31.
def CSR_Interrupt_RVE : CalleeSavedRegs<(sub CSR_Interrupt,
(sequence "X%u", 16, 31))>;
// Same as CSR_XLEN_F32_Interrupt, but excluding X16-X31.
def CSR_XLEN_F32_Interrupt_RVE: CalleeSavedRegs<(sub CSR_XLEN_F32_Interrupt,
- (sequence "X%u", 16, 31))>;
+ (sequence "X%u", 16, 31))>;
// Same as CSR_XLEN_F64_Interrupt, but excluding X16-X31.
def CSR_XLEN_F64_Interrupt_RVE: CalleeSavedRegs<(sub CSR_XLEN_F64_Interrupt,
- (sequence "X%u", 16, 31))>;
+ (sequence "X%u", 16, 31))>;
+
+// Same as CSR_XLEN_V_Interrupt, but excluding X16-X31.
+def CSR_XLEN_V_Interrupt_RVE: CalleeSavedRegs<(add CSR_Interrupt,
+ (sequence "V%u", 0, 31))>;
+
+// Same as CSR_XLEN_F32_V_Interrupt, but excluding X16-X31.
+def CSR_XLEN_F32_V_Interrupt_RVE: CalleeSavedRegs<(add CSR_Interrupt,
+ (sequence "F%u_F", 0, 31),
+ (sequence "V%u", 0, 31))>;
+
+// Same as CSR_XLEN_F64_V_Interrupt, but excluding X16-X31.
+def CSR_XLEN_F64_V_Interrupt_RVE: CalleeSavedRegs<(add CSR_Interrupt,
+ (sequence "F%u_D", 0, 31),
+ (sequence "V%u", 0, 31))>;
diff --git a/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp b/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
index 112142e1ef2f2..7fdbf4be1ed12 100644
--- a/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
@@ -69,6 +69,16 @@ RISCVRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
if (MF->getFunction().getCallingConv() == CallingConv::GHC)
return CSR_NoRegs_SaveList;
if (MF->getFunction().hasFnAttribute("interrupt")) {
+ if (Subtarget.hasVInstructions()) {
+ if (Subtarget.hasStdExtD())
+ return Subtarget.hasStdExtE() ? CSR_XLEN_F64_V_Interrupt_RVE_SaveList
+ : CSR_XLEN_F64_V_Interrupt_SaveList;
+ if (Subtarget.hasStdExtF())
+ return Subtarget.hasStdExtE() ? CSR_XLEN_F32_V_Interrupt_RVE_SaveList
+ : CSR_XLEN_F32_V_Interrupt_SaveList;
+ return Subtarget.hasStdExtE() ? CSR_XLEN_V_Interrupt_RVE_SaveList
+ : CSR_XLEN_V_Interrupt_SaveList;
+ }
if (Subtarget.hasStdExtD())
return Subtarget.hasStdExtE() ? CSR_XLEN_F64_Interrupt_RVE_SaveList
: CSR_XLEN_F64_Interrupt_SaveList;
diff --git a/llvm/test/CodeGen/RISCV/interrupt-attr.ll b/llvm/test/CodeGen/RISCV/interrupt-attr.ll
index ba20ba77e6b26..e278b8d0b53b2 100644
--- a/llvm/test/CodeGen/RISCV/interrupt-attr.ll
+++ b/llvm/test/CodeGen/RISCV/interrupt-attr.ll
@@ -19,6 +19,13 @@
; RUN: 2>&1 | FileCheck %s -check-prefixes=CHECK,CHECK-RV32E
; RUN: llc -mtriple riscv32-unknown-elf -mattr=+e,+f -o - %s \
; RUN: 2>&1 | FileCheck %s -check-prefixes=CHECK,CHECK-RV32E-F
+
+; RUN: llc -mtriple riscv32-unknown-elf -mattr=+zve32x -o - %s \
+; RUN: 2>&1 | FileCheck %s -check-prefix CHECK -check-prefix CHECK-RV32-V
+; RUN: llc -mtriple riscv32-unknown-elf -mattr=+zve32x,+f -o - %s \
+; RUN: 2>&1 | FileCheck %s -check-prefix CHECK -check-prefix CHECK-RV32-FV
+; RUN: llc -mtriple riscv32-unknown-elf -mattr=+zve32x,+f,+d -o - %s \
+; RUN: 2>&1 | FileCheck %s -check-prefix CHECK -check-prefix CHECK-RV32-FDV
;
; RUN: llc -mtriple riscv64-unknown-elf -o - %s \
; RUN: 2>&1 | FileCheck %s -check-prefix CHECK -check-prefix CHECK-RV64
@@ -42,6 +49,13 @@
; RUN: 2>&1 | FileCheck %s -check-prefixes=CHECK,CHECK-RV64E-F
; RUN: llc -mtriple riscv64-unknown-elf -mattr=+e,+f,+d -o - %s \
; RUN: 2>&1 | FileCheck %s -check-prefixes=CHECK,CHECK-RV64E-FD
+;
+; RUN: llc -mtriple riscv64-unknown-elf -mattr=+zve32x -o - %s \
+; RUN: 2>&1 | FileCheck %s -check-prefix CHECK -check-prefix CHECK-RV64-V
+; RUN: llc -mtriple riscv64-unknown-elf -mattr=+zve32x,+f -o - %s \
+; RUN: 2>&1 | FileCheck %s -check-prefix CHECK -check-prefix CHECK-RV64-FV
+; RUN: llc -mtriple riscv64-unknown-elf -mattr=+zve32x,+f,+d -o - %s \
+; RUN: 2>&1 | FileCheck %s -check-prefix CHECK -check-prefix CHECK-RV64-FDV
;
; Checking for special return instructions (sret, mret).
@@ -757,6 +771,1697 @@ define void @foo_with_call() #1 {
; CHECK-RV32E-F-NEXT: addi sp, sp, 168
; CHECK-RV32E-F-NEXT: mret
;
+; CHECK-RV32-V-LABEL: foo_with_call:
+; CHECK-RV32-V: # %bb.0:
+; CHECK-RV32-V-NEXT: addi sp, sp, -80
+; CHECK-RV32-V-NEXT: sw ra, 76(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw t0, 72(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw t1, 68(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw t2, 64(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw a0, 60(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw a1, 56(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw a2, 52(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw a3, 48(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw a4, 44(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw a5, 40(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw a6, 36(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw a7, 32(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw t3, 28(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw t4, 24(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw t5, 20(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: sw t6, 16(sp) # 4-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 5
+; CHECK-RV32-V-NEXT: sub sp, sp, a0
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a1, a0, 5
+; CHECK-RV32-V-NEXT: sub a0, a1, a0
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v0, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v1, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v2, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v3, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v4, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v5, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 3
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v6, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 3
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v7, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v8, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v9, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v10, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v11, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 3
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v12, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 3
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v13, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a1, a0, 4
+; CHECK-RV32-V-NEXT: add a0, a1, a0
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v14, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 4
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v15, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a1, a0, 4
+; CHECK-RV32-V-NEXT: sub a0, a1, a0
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v16, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v17, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v18, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v19, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v20, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v21, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a1, a0, 3
+; CHECK-RV32-V-NEXT: add a0, a1, a0
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v22, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 3
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v23, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a1, a0, 3
+; CHECK-RV32-V-NEXT: sub a0, a1, a0
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v24, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v25, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a1, a0, 2
+; CHECK-RV32-V-NEXT: add a0, a1, a0
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v26, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v27, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a1, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a1, a0
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v28, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v29, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vs1r.v v30, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: addi a0, sp, 16
+; CHECK-RV32-V-NEXT: vs1r.v v31, (a0) # vscale x 8-byte Folded Spill
+; CHECK-RV32-V-NEXT: call otherfoo
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a1, a0, 5
+; CHECK-RV32-V-NEXT: sub a0, a1, a0
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vl1r.v v0, (a0) # vscale x 8-byte Folded Reload
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vl1r.v v1, (a0) # vscale x 8-byte Folded Reload
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vl1r.v v2, (a0) # vscale x 8-byte Folded Reload
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vl1r.v v3, (a0) # vscale x 8-byte Folded Reload
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+; CHECK-RV32-V-NEXT: addi a0, a0, 16
+; CHECK-RV32-V-NEXT: vl1r.v v4, (a0) # vscale x 8-byte Folded Reload
+; CHECK-RV32-V-NEXT: csrr a0, vlenb
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: mv a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 2
+; CHECK-RV32-V-NEXT: add a1, a1, a0
+; CHECK-RV32-V-NEXT: slli a0, a0, 1
+; CHECK-RV32-V-NEXT: add a0, a0, a1
+; CHECK-RV32-V-NEXT: add a0, sp, a0
+;...
[truncated]
; CHECK-RV32-V-NEXT: slli a0, a0, 1
; CHECK-RV32-V-NEXT: add a0, a0, a1
; CHECK-RV32-V-NEXT: add a0, sp, a0
; CHECK-RV32-V-NEXT: addi a0, a0, 16
Too many duplicated instructions, we should try to remove them as follow-ups.
Yeah, the current codegen suggests we need some kind of "add scaled vlen + constant to GPR" instruction, but I doubt that exists.
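The shift-and-add chains in the quoted checks each compute a multiple of `vlenb`. Reading the CHECK-RV32-V lines in the truncated diff, v31 is stored at `sp + 16`, v30 at `sp + vlenb + 16`, and v0 at `sp + 31*vlenb + 16`, which suggests a slot offset of `(31 - N) * vlenb + 16` for register vN. A minimal C++ sketch, with hypothetical helper names that are not part of the patch, models that inferred formula and replays the sequence emitted before `vs1r.v v1` to check that both give the same offset:

```cpp
#include <cstdint>

// Hypothetical helper: byte offset from sp of the spill slot for vN, as
// inferred from the CHECK-RV32-V lines (v31 at sp+16, v0 at sp+31*vlenb+16).
// The constant 16 is specific to that test's frame layout.
inline std::uint64_t vregSlotOffset(unsigned n, std::uint64_t vlenb) {
  return (31 - n) * vlenb + 16;
}

// Replays the scalar sequence the test shows before `vs1r.v v1`.
// `add a0, sp, a0` is left out, so the result is relative to sp.
inline std::uint64_t replayV1Sequence(std::uint64_t vlenb) {
  std::uint64_t a0 = vlenb; // csrr a0, vlenb
  a0 <<= 1;                 // slli a0, a0, 1   -> 2 * vlenb
  std::uint64_t a1 = a0;    // mv   a1, a0      -> 2 * vlenb
  a0 <<= 1;                 // slli a0, a0, 1   -> 4 * vlenb
  a1 += a0;                 // add  a1, a1, a0  -> 6 * vlenb
  a0 <<= 1;                 // slli a0, a0, 1   -> 8 * vlenb
  a1 += a0;                 // add  a1, a1, a0  -> 14 * vlenb
  a0 <<= 1;                 // slli a0, a0, 1   -> 16 * vlenb
  a0 += a1;                 // add  a0, a0, a1  -> 30 * vlenb
  return a0 + 16;           // addi a0, a0, 16
}
```

In the test output that is eleven scalar instructions to form an address that differs from the neighbouring slot by exactly `vlenb`, which is the duplication the comments point at.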
// registers.
def CSR_XLEN_F64_V_Interrupt: CalleeSavedRegs<(add CSR_Interrupt,
(sequence "F%u_D", 0, 31),
(sequence "V%u", 0, 31))>;
Is it possible to use LMUL8 registers?
I tried, but it caused us to spill an LMUL=8 register even if only 1 LMUL=1 register was used. Not sure if we should be optimizing for number of instructions or amount of stack space required.
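To make the trade-off concrete, here is a rough C++ model of the two costs mentioned: store count and stack bytes. It assumes, purely for illustration, that a group of `groupSize` consecutive vector registers is spilled whole as soon as any member is used; `SpillCost` and `spillCost` are hypothetical names, and the model ignores the scalar address arithmetic around each store.

```cpp
#include <cstdint>

struct SpillCost {
  unsigned stores;      // whole-register store instructions (vs1r.v / vs8r.v)
  std::uint64_t bytes;  // stack bytes reserved for the vector spills
};

// `used` has bit N set when vN is used. groupSize must be 1, 2, 4, or 8:
// 1 models an LMUL=1 save list, 8 models an LMUL=8 save list where the
// aligned groups are v0-v7, v8-v15, v16-v23, v24-v31.
inline SpillCost spillCost(std::uint32_t used, unsigned groupSize,
                           std::uint64_t vlenb) {
  SpillCost cost{0, 0};
  for (unsigned base = 0; base < 32; base += groupSize) {
    const std::uint32_t groupMask = ((1u << groupSize) - 1u) << base;
    if (used & groupMask) {
      ++cost.stores;                    // one store covers the whole group
      cost.bytes += groupSize * vlenb;  // but it reserves the whole group
    }
  }
  return cost;
}
```

Under this model a handler that touches only v24 pays one store either way, but reserves `vlenb` bytes with LMUL=1 entries versus `8 * vlenb` with LMUL=8 entries. A handler that uses all 32 registers reserves the same space either way and needs 32 stores versus 4.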
I have questions about what we do for the callee saved registers for vector calling convention now.
cc @4vtomat
For the vector callee-saved registers we just list every combination of vector register class; we didn't do anything special lol
I think callee-saved registers also have this issue: when I change
llvm-project/llvm/test/CodeGen/RISCV/rvv/callee-saved-regs.ll
Lines 89 to 90 in 4f60321
call void asm sideeffect "",
"~{v0},~{v1},~{v2},~{v3},~{v4},~{v5},~{v6},~{v7},~{v8},~{v9},~{v10},~{v11},~{v12},~{v13},~{v14},~{v15},~{v16},~{v17},~{v18},~{v19},~{v20},~{v21},~{v22},~{v23},~{v24},~{v25},~{v26},~{v27},~{v28},~{v29},~{v30},~{v31}"()
to clobber only v24, it uses vs8r.
It seems a register is marked as saved if any of its aliases (including sub-registers and super-registers) is used. So in this case, if v24 is clobbered, all of the super-registers of v24 in the callee-saved list, including v24, v24m2, v24m4, and v24m8, are set in SavedRegs.
https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp#L145-L146
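The aliasing behaviour described here can be illustrated with a toy model. This is not LLVM's implementation: `VRegGroup`, `allGroups`, and `savedFor` are hypothetical names, and the model covers only the direction discussed in the comment, where an LMUL=1 register is clobbered and every callee-saved entry that contains it gets marked.

```cpp
#include <string>
#include <vector>

// A register group covering v[base] .. v[base + lmul - 1].
struct VRegGroup {
  unsigned base;
  unsigned lmul;
};

inline bool covers(const VRegGroup &g, unsigned n) {
  return n >= g.base && n < g.base + g.lmul;
}

// Toy callee-saved list holding every aligned group for LMUL 1, 2, 4, and 8,
// mirroring the "every combination of vector register class" remark.
inline std::vector<VRegGroup> allGroups() {
  std::vector<VRegGroup> out;
  for (unsigned lmul : {1u, 2u, 4u, 8u})
    for (unsigned base = 0; base < 32; base += lmul)
      out.push_back({base, lmul});
  return out;
}

// Names of all list entries that alias the clobbered register vN.
inline std::vector<std::string> savedFor(unsigned clobbered) {
  std::vector<std::string> names;
  for (const VRegGroup &g : allGroups()) {
    if (!covers(g, clobbered))
      continue;
    std::string name = "v" + std::to_string(g.base);
    if (g.lmul > 1)
      name += "m" + std::to_string(g.lmul);
    names.push_back(name);
  }
  return names;
}
```

Calling `savedFor(24)` in this model yields v24, v24m2, v24m4, and v24m8, the same four entries named in the comment. If the widest marked entry is the one that ends up being spilled, that would be consistent with the `vs8r` observation.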
Ping
def CSR_XLEN_V_Interrupt_RVE: CalleeSavedRegs<(add CSR_Interrupt,
(sequence "V%u", 0, 31))>;
This doesn't look correct? I think it should be this (and similar changes in the 2 instances below)
-def CSR_XLEN_V_Interrupt_RVE: CalleeSavedRegs<(add CSR_Interrupt,
-                                             (sequence "V%u", 0, 31))>;
+def CSR_XLEN_V_Interrupt_RVE: CalleeSavedRegs<(add CSR_Interrupt_RVE,
+                                             (sequence "V%u", 0, 31))>;
You are correct. Unfortunately, vectors+RVE crashes so I couldn't test it.
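The difference between the posted definition and the suggested one can be sketched with plain sets. The contents of `CSR_Interrupt` are not shown in the truncated diff, so the sketch assumes, for illustration only, that it is X1 plus X5 through X31. `CSR_Interrupt_RVE` follows the `sub ... (sequence "X%u", 16, 31)` definition that does appear in the diff. All function names are hypothetical.

```cpp
#include <set>
#include <string>

using RegSet = std::set<std::string>;

inline void addSequence(RegSet &s, const std::string &prefix, unsigned lo,
                        unsigned hi) {
  for (unsigned i = lo; i <= hi; ++i)
    s.insert(prefix + std::to_string(i));
}

inline void subSequence(RegSet &s, const std::string &prefix, unsigned lo,
                        unsigned hi) {
  for (unsigned i = lo; i <= hi; ++i)
    s.erase(prefix + std::to_string(i));
}

// ASSUMPTION: stand-in contents for CSR_Interrupt (X1 plus X5..X31).
inline RegSet csrInterrupt() {
  RegSet s{"X1"};
  addSequence(s, "X", 5, 31);
  return s;
}

// From the diff: CSR_Interrupt minus X16..X31.
inline RegSet csrInterruptRVE() {
  RegSet s = csrInterrupt();
  subSequence(s, "X", 16, 31);
  return s;
}

// As posted: (add CSR_Interrupt, (sequence "V%u", 0, 31)).
inline RegSet vInterruptRVEAsPosted() {
  RegSet s = csrInterrupt();
  addSequence(s, "V", 0, 31);
  return s;
}

// As suggested in review: (add CSR_Interrupt_RVE, (sequence "V%u", 0, 31)).
inline RegSet vInterruptRVESuggested() {
  RegSet s = csrInterruptRVE();
  addSequence(s, "V", 0, 31);
  return s;
}
```

Under that assumption the posted list still contains X16 through X31, contradicting its own "excluding X16-X31" comment, while the suggested list drops them and keeps all 32 vector registers.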