-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[StackFrameLayoutAnalysis] Add basic Scalable stack slot output #99883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The existing StackFrameLayoutAnalysis details do not do well with Scalable vector stack slots, which are not marked as scalable and intertwined with the other fixed-size slots. This patch adds some very basic support, marking them as scalable and sorting them to the end of the list. The slot addresses are not really correct (for fixed as well as scalable), but this prints something a little better with the limited information curently available.
@llvm/pr-subscribers-backend-aarch64 Author: David Green (davemgreen) ChangesThe existing StackFrameLayoutAnalysis details do not do well with Scalable vector stack slots, which are not marked as scalable and intertwined with the other fixed-size slots. This patch adds some very basic support, marking them as scalable and sorting them to the end of the list. The slot addresses are not really correct (for fixed as well as scalable), but this prints something a little better with the limited information curently available. Full diff: https://github.com/llvm/llvm-project/pull/99883.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp b/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
index 5d3903ed84ce8..940aecd1cb363 100644
--- a/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
+++ b/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
@@ -62,11 +62,14 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
int Align;
int Offset;
SlotType SlotTy;
+ bool Scalable;
SlotData(const MachineFrameInfo &MFI, const int ValOffset, const int Idx)
: Slot(Idx), Size(MFI.getObjectSize(Idx)),
Align(MFI.getObjectAlign(Idx).value()),
- Offset(MFI.getObjectOffset(Idx) - ValOffset), SlotTy(Invalid) {
+ Offset(MFI.getObjectOffset(Idx) - ValOffset), SlotTy(Invalid),
+ Scalable(false) {
+ Scalable = MFI.getStackID(Idx) == TargetStackID::ScalableVector;
if (MFI.isSpillSlotObjectIndex(Idx))
SlotTy = SlotType::Spill;
else if (Idx == MFI.getStackProtectorIndex())
@@ -75,9 +78,12 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
SlotTy = SlotType::Variable;
}
- // we use this to sort in reverse order, so that the layout is displayed
- // correctly
- bool operator<(const SlotData &Rhs) const { return Offset > Rhs.Offset; }
+ // We use this to sort in reverse order, so that the layout is displayed
+ // correctly. Scalable slots are sorted to the end of the list.
+ bool operator<(const SlotData &Rhs) const {
+ return std::make_tuple(!Scalable, Offset) >
+ std::make_tuple(!Rhs.Scalable, Rhs.Offset);
+ }
};
StackFrameLayoutAnalysisPass() : MachineFunctionPass(ID) {}
@@ -153,7 +159,7 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
Rem << Prefix << ore::NV("Offset", D.Offset)
<< "], Type: " << ore::NV("Type", getTypeString(D.SlotTy))
<< ", Align: " << ore::NV("Align", D.Align)
- << ", Size: " << ore::NV("Size", D.Size);
+ << ", Size: " << ore::NV("Size", ElementCount::get(D.Size, D.Scalable));
}
void emitSourceLocRemark(const MachineFunction &MF, const DILocalVariable *N,
diff --git a/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll b/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
index 67c1486b7a8c0..34d85d1f76086 100644
--- a/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
+++ b/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
@@ -4,10 +4,10 @@
; CHECK-FRAMELAYOUT-LABEL: Function: csr_d8_allocnxv4i32i32f64
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Variable, Align: 16, Size: 16
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20], Type: Variable, Align: 4, Size: 4
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Variable, Align: 16, Size: vscale x 16
define i32 @csr_d8_allocnxv4i32i32f64(double %d) "aarch64_pstate_sm_compatible" {
; CHECK-LABEL: csr_d8_allocnxv4i32i32f64:
@@ -46,11 +46,11 @@ entry:
; CHECK-FRAMELAYOUT-LABEL: Function: csr_d8_allocnxv4i32i32f64_fp
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Variable, Align: 16, Size: 16
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20], Type: Variable, Align: 4, Size: 4
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Spill, Align: 16, Size: 8
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-40], Type: Variable, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Variable, Align: 16, Size: vscale x 16
define i32 @csr_d8_allocnxv4i32i32f64_fp(double %d) "aarch64_pstate_sm_compatible" "frame-pointer"="all" {
; CHECK-LABEL: csr_d8_allocnxv4i32i32f64_fp:
@@ -92,11 +92,11 @@ entry:
; CHECK-FRAMELAYOUT-LABEL: Function: svecc_z8_allocnxv4i32i32f64_fp
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 16, Size: 16
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20], Type: Variable, Align: 4, Size: 4
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 16, Size: 16
; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 16, Size: vscale x 16
define i32 @svecc_z8_allocnxv4i32i32f64_fp(double %d, <vscale x 4 x i32> %v) "aarch64_pstate_sm_compatible" "frame-pointer"="all" {
; CHECK-LABEL: svecc_z8_allocnxv4i32i32f64_fp:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for catching this, I hadn’t realized we were handling scalable vector differently. I’ll see if I can find a way to sort out the offsets being incorrect.
Thanks folks. I'd be interested if there was a good way to handle this. Unfortunately we might need to invent something. The AArch64 stack frame can have scalable and fixed-width objects combined in its layout, where some parts are fp-based and others are sp. I don't know if there is a way at the moment to get the real offset and was wondering what the best way to add it might be. |
Yeah, I think this will probably take some surgery to work out. I know the DWARF for RISC-V vector spills is pretty complicated, too, so I doubt it will be simple to fix. I’m probably going to have to rethink a few assumptions this pass made. I wonder if BOLT has some tricks for this kind of thing? Anyway, lots to think about. |
The test I pre-committed for this earlier is acting differently on different bots due to the sort order of the objects with identical offsets not being deterministic. Hopefully this should fix it as the objects will have different orders now. |
Summary: The existing StackFrameLayoutAnalysis details do not do well with Scalable vector stack slots, which are not marked as scalable and intertwined with the other fixed-size slots. This patch adds some very basic support, marking them as scalable and sorting them to the end of the list. The slot addresses are not really correct (for fixed as well as scalable), but this prints something a little better with the limited information curently available. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251075
The existing StackFrameLayoutAnalysis details do not do well with Scalable vector stack slots, which are not marked as scalable and intertwined with the other fixed-size slots. This patch adds some very basic support, marking them as scalable and sorting them to the end of the list. The slot addresses are not really correct (for fixed as well as scalable), but this prints something a little better with the limited information curently available.