-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[CodeGen] Compute call frame sizes instead of storing them in MachineBBs #106964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CodeGen] Compute call frame sizes instead of storing them in MachineBBs #106964
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. Join @ritter-x2a and the rest of your teammates on |
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-backend-arm Author: Fabian Ritter (ritter-x2a) ChangesCall frames can be zero-sized. Distinguishing between instructions in a This patch uses std::optional<unsigned> instead of unsigned to represent Full diff: https://github.com/llvm/llvm-project/pull/106964.diff 14 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/MachineBasicBlock.h b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
index 6efb17c55493a9..cbc2c66d8bb5d8 100644
--- a/llvm/include/llvm/CodeGen/MachineBasicBlock.h
+++ b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
@@ -146,12 +146,13 @@ class MachineBasicBlock
int Number;
/// The call frame size on entry to this basic block due to call frame setup
- /// instructions in a predecessor. This is usually zero, unless basic blocks
- /// are split in the middle of a call sequence.
+ /// instructions in a predecessor. Usually does not contain a value, unless
+ /// basic blocks are split in the middle of a call sequence. Zero-sized call
+ /// frames are possible.
///
/// This information is only maintained until PrologEpilogInserter eliminates
/// call frame pseudos.
- unsigned CallFrameSize = 0;
+ std::optional<unsigned> CallFrameSize;
MachineFunction *xParent;
Instructions Insts;
@@ -1210,9 +1211,17 @@ class MachineBasicBlock
void setNumber(int N) { Number = N; }
/// Return the call frame size on entry to this basic block.
- unsigned getCallFrameSize() const { return CallFrameSize; }
+ std::optional<unsigned> getCallFrameSize() const { return CallFrameSize; }
+
+ /// Return the call frame size on entry to this basic block if it is part of a
+ /// call sequence and 0 otherwise.
+ unsigned getCallFrameSizeOrZero() const { return CallFrameSize.value_or(0); }
+
/// Set the call frame size on entry to this basic block.
- void setCallFrameSize(unsigned N) { CallFrameSize = N; }
+ void setCallFrameSize(std::optional<unsigned> N) { CallFrameSize = N; }
+
+ /// Reset the stored call frame size.
+ void clearCallFrameSize() { CallFrameSize.reset(); }
/// Return the MCSymbol for this basic block.
MCSymbol *getSymbol() const;
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 49ce13dd8cbe39..0ae3a0ca53757a 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -2265,8 +2265,9 @@ class TargetInstrInfo : public MCInstrInfo {
return false;
}
- // Get the call frame size just before MI.
- unsigned getCallFrameSizeAt(MachineInstr &MI) const;
+ /// Get the call frame size just before MI. Contains no value if MI is not in
+ /// a call sequence. Zero-sized call frames are possible.
+ std::optional<unsigned> getCallFrameSizeAt(MachineInstr &MI) const;
/// Fills in the necessary MachineOperands to refer to a frame index.
/// The best way to understand this is to print `asm(""::"m"(x));` after
diff --git a/llvm/lib/CodeGen/MIRParser/MIParser.cpp b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
index 47b220172602d4..4295956b2a5cc9 100644
--- a/llvm/lib/CodeGen/MIRParser/MIParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
@@ -502,7 +502,7 @@ class MIParser {
bool parseAddrspace(unsigned &Addrspace);
bool parseSectionID(std::optional<MBBSectionID> &SID);
bool parseBBID(std::optional<UniqueBBID> &BBID);
- bool parseCallFrameSize(unsigned &CallFrameSize);
+ bool parseCallFrameSize(std::optional<unsigned> &CallFrameSize);
bool parseOperandsOffset(MachineOperand &Op);
bool parseIRValue(const Value *&V);
bool parseMemoryOperandFlag(MachineMemOperand::Flags &Flags);
@@ -685,7 +685,7 @@ bool MIParser::parseBBID(std::optional<UniqueBBID> &BBID) {
}
// Parse basic block call frame size.
-bool MIParser::parseCallFrameSize(unsigned &CallFrameSize) {
+bool MIParser::parseCallFrameSize(std::optional<unsigned> &CallFrameSize) {
assert(Token.is(MIToken::kw_call_frame_size));
lex();
unsigned Value = 0;
@@ -713,7 +713,7 @@ bool MIParser::parseBasicBlockDefinition(
std::optional<MBBSectionID> SectionID;
uint64_t Alignment = 0;
std::optional<UniqueBBID> BBID;
- unsigned CallFrameSize = 0;
+ std::optional<unsigned> CallFrameSize;
BasicBlock *BB = nullptr;
if (consumeIfPresent(MIToken::lparen)) {
do {
diff --git a/llvm/lib/CodeGen/MachineBasicBlock.cpp b/llvm/lib/CodeGen/MachineBasicBlock.cpp
index 5d06af3ebf3360..a948efb41d7152 100644
--- a/llvm/lib/CodeGen/MachineBasicBlock.cpp
+++ b/llvm/lib/CodeGen/MachineBasicBlock.cpp
@@ -578,9 +578,9 @@ void MachineBasicBlock::printName(raw_ostream &os, unsigned printNameFlags,
os << " " << getBBID()->CloneID;
hasAttributes = true;
}
- if (CallFrameSize != 0) {
+ if (CallFrameSize.has_value()) {
os << (hasAttributes ? ", " : " (");
- os << "call-frame-size " << CallFrameSize;
+ os << "call-frame-size " << CallFrameSize.value();
hasAttributes = true;
}
}
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index 759201ed9dadc7..62ece9ef54b58b 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -3837,11 +3837,11 @@ void MachineVerifier::verifyStackFrame() {
BBState.ExitIsSetup = BBState.EntryIsSetup;
}
- if ((int)MBB->getCallFrameSize() != -BBState.EntryValue) {
+ if ((int)MBB->getCallFrameSizeOrZero() != -BBState.EntryValue) {
report("Call frame size on entry does not match value computed from "
"predecessor",
MBB);
- errs() << "Call frame size on entry " << MBB->getCallFrameSize()
+ errs() << "Call frame size on entry " << MBB->getCallFrameSizeOrZero()
<< " does not match value computed from predecessor "
<< -BBState.EntryValue << '\n';
}
diff --git a/llvm/lib/CodeGen/PrologEpilogInserter.cpp b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
index ee03eaa8ae527c..b2f355c0aa8ccf 100644
--- a/llvm/lib/CodeGen/PrologEpilogInserter.cpp
+++ b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -392,9 +392,9 @@ void PEI::calculateCallFrameInfo(MachineFunction &MF) {
TFI->eliminateCallFramePseudoInstr(MF, *I->getParent(), I);
// We can't track the call frame size after call frame pseudos have been
- // eliminated. Set it to zero everywhere to keep MachineVerifier happy.
+ // eliminated. Clear it everywhere to keep MachineVerifier happy.
for (MachineBasicBlock &MBB : MF)
- MBB.setCallFrameSize(0);
+ MBB.clearCallFrameSize();
}
}
@@ -1350,11 +1350,11 @@ void PEI::replaceFrameIndicesBackward(MachineFunction &MF) {
// Get the SP adjustment for the end of MBB from the start of any of its
// successors. They should all be the same.
assert(all_of(MBB.successors(), [&MBB](const MachineBasicBlock *Succ) {
- return Succ->getCallFrameSize() ==
- (*MBB.succ_begin())->getCallFrameSize();
+ return Succ->getCallFrameSizeOrZero() ==
+ (*MBB.succ_begin())->getCallFrameSizeOrZero();
}));
const MachineBasicBlock &FirstSucc = **MBB.succ_begin();
- SPAdj = TFI.alignSPAdjust(FirstSucc.getCallFrameSize());
+ SPAdj = TFI.alignSPAdjust(FirstSucc.getCallFrameSizeOrZero());
if (TFI.getStackGrowthDirection() == TargetFrameLowering::StackGrowsUp)
SPAdj = -SPAdj;
}
@@ -1362,8 +1362,8 @@ void PEI::replaceFrameIndicesBackward(MachineFunction &MF) {
replaceFrameIndicesBackward(&MBB, MF, SPAdj);
// We can't track the call frame size after call frame pseudos have been
- // eliminated. Set it to zero everywhere to keep MachineVerifier happy.
- MBB.setCallFrameSize(0);
+ // eliminated. Clear it everywhere to keep MachineVerifier happy.
+ MBB.clearCallFrameSize();
}
}
@@ -1373,15 +1373,15 @@ void PEI::replaceFrameIndices(MachineFunction &MF) {
const TargetFrameLowering &TFI = *MF.getSubtarget().getFrameLowering();
for (auto &MBB : MF) {
- int SPAdj = TFI.alignSPAdjust(MBB.getCallFrameSize());
+ int SPAdj = TFI.alignSPAdjust(MBB.getCallFrameSizeOrZero());
if (TFI.getStackGrowthDirection() == TargetFrameLowering::StackGrowsUp)
SPAdj = -SPAdj;
replaceFrameIndices(&MBB, MF, SPAdj);
// We can't track the call frame size after call frame pseudos have been
- // eliminated. Set it to zero everywhere to keep MachineVerifier happy.
- MBB.setCallFrameSize(0);
+ // eliminated. Clear it everywhere to keep MachineVerifier happy.
+ MBB.clearCallFrameSize();
}
}
diff --git a/llvm/lib/CodeGen/TargetInstrInfo.cpp b/llvm/lib/CodeGen/TargetInstrInfo.cpp
index 38bd0b0ba4114c..f623ba0352de8b 100644
--- a/llvm/lib/CodeGen/TargetInstrInfo.cpp
+++ b/llvm/lib/CodeGen/TargetInstrInfo.cpp
@@ -1624,15 +1624,15 @@ TargetInstrInfo::describeLoadedValue(const MachineInstr &MI,
return std::nullopt;
}
-// Get the call frame size just before MI.
-unsigned TargetInstrInfo::getCallFrameSizeAt(MachineInstr &MI) const {
+std::optional<unsigned>
+TargetInstrInfo::getCallFrameSizeAt(MachineInstr &MI) const {
// Search backwards from MI for the most recent call frame instruction.
MachineBasicBlock *MBB = MI.getParent();
for (auto &AdjI : reverse(make_range(MBB->instr_begin(), MI.getIterator()))) {
if (AdjI.getOpcode() == getCallFrameSetupOpcode())
return getFrameTotalSize(AdjI);
if (AdjI.getOpcode() == getCallFrameDestroyOpcode())
- return 0;
+ return std::nullopt;
}
// If none was found, use the call frame size from the start of the basic
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 9096617a948557..e2c57d8b9ac524 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -11594,7 +11594,7 @@ ARMTargetLowering::EmitStructByval(MachineInstr &MI,
MF->insert(It, exitMBB);
// Set the call frame size on entry to the new basic blocks.
- unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+ std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
loopMBB->setCallFrameSize(CallFrameSize);
exitMBB->setCallFrameSize(CallFrameSize);
@@ -12195,7 +12195,7 @@ ARMTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
F->insert(It, sinkMBB);
// Set the call frame size on entry to the new basic blocks.
- unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+ std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
copy0MBB->setCallFrameSize(CallFrameSize);
sinkMBB->setCallFrameSize(CallFrameSize);
diff --git a/llvm/lib/Target/AVR/AVRISelLowering.cpp b/llvm/lib/Target/AVR/AVRISelLowering.cpp
index 0046e757f4efa2..cf6f026869da80 100644
--- a/llvm/lib/Target/AVR/AVRISelLowering.cpp
+++ b/llvm/lib/Target/AVR/AVRISelLowering.cpp
@@ -2424,7 +2424,7 @@ AVRTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
MF->insert(I, falseMBB);
// Set the call frame size on entry to the new basic blocks.
- unsigned CallFrameSize = TII.getCallFrameSizeAt(MI);
+ std::optional<unsigned> CallFrameSize = TII.getCallFrameSizeAt(MI);
trueMBB->setCallFrameSize(CallFrameSize);
falseMBB->setCallFrameSize(CallFrameSize);
diff --git a/llvm/lib/Target/M68k/M68kISelLowering.cpp b/llvm/lib/Target/M68k/M68kISelLowering.cpp
index 316a6eebc2db0f..3b0a0c83d94d16 100644
--- a/llvm/lib/Target/M68k/M68kISelLowering.cpp
+++ b/llvm/lib/Target/M68k/M68kISelLowering.cpp
@@ -3199,7 +3199,7 @@ M68kTargetLowering::EmitLoweredSelect(MachineInstr &MI,
F->insert(It, SinkMBB);
// Set the call frame size on entry to the new basic blocks.
- unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+ std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
Copy0MBB->setCallFrameSize(CallFrameSize);
SinkMBB->setCallFrameSize(CallFrameSize);
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 8a0858e2462520..f56443770095c7 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -13031,7 +13031,7 @@ PPCTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
// Set the call frame size on entry to the new basic blocks.
// See https://reviews.llvm.org/D156113.
- unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+ std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
copy0MBB->setCallFrameSize(CallFrameSize);
sinkMBB->setCallFrameSize(CallFrameSize);
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d02078372b24a2..7fcc624e834b9e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -18444,7 +18444,8 @@ static MachineBasicBlock *emitSelectPseudo(MachineInstr &MI,
F->insert(I, TailMBB);
// Set the call frame size on entry to the new basic blocks.
- unsigned CallFrameSize = TII.getCallFrameSizeAt(*LastSelectPseudo);
+ std::optional<unsigned> CallFrameSize =
+ TII.getCallFrameSizeAt(*LastSelectPseudo);
IfFalseMBB->setCallFrameSize(CallFrameSize);
TailMBB->setCallFrameSize(CallFrameSize);
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index f011249d295040..bbee0af109c74b 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -35264,7 +35264,7 @@ X86TargetLowering::EmitLoweredSelect(MachineInstr &MI,
F->insert(It, SinkMBB);
// Set the call frame size on entry to the new basic blocks.
- unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+ std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
FalseMBB->setCallFrameSize(CallFrameSize);
SinkMBB->setCallFrameSize(CallFrameSize);
diff --git a/llvm/test/CodeGen/MIR/ARM/call-frame-size.mir b/llvm/test/CodeGen/MIR/ARM/call-frame-size.mir
index d3e87120693cf3..58e7f50e583455 100644
--- a/llvm/test/CodeGen/MIR/ARM/call-frame-size.mir
+++ b/llvm/test/CodeGen/MIR/ARM/call-frame-size.mir
@@ -23,3 +23,25 @@ body: |
ADJCALLSTACKUP 100, 0, 14, $noreg, implicit-def $sp, implicit $sp
bb.2:
...
+
+---
+name: callframesizezero
+body: |
+ ; CHECK-LABEL: name: callframesizezero
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, 14 /* CC::al */, $noreg, implicit-def $sp, implicit $sp
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1 (call-frame-size 0):
+ ; CHECK-NEXT: successors: %bb.2(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, 14 /* CC::al */, $noreg, implicit-def $sp, implicit $sp
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ bb.0:
+ ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
+ bb.1 (call-frame-size 0):
+ ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
+ bb.2:
+...
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Distinguishing between instructions in a
zero-sized call frame and instructions outside of call frames via the
call frame size info stored in the MachineBBs is helpful to avoid nested
CALLSEQs
Do you have an example of where this is useful?
llvm/lib/CodeGen/MachineVerifier.cpp
Outdated
@@ -3837,11 +3837,11 @@ void MachineVerifier::verifyStackFrame() { | |||
BBState.ExitIsSetup = BBState.EntryIsSetup; | |||
} | |||
|
|||
if ((int)MBB->getCallFrameSize() != -BBState.EntryValue) { | |||
if ((int)MBB->getCallFrameSizeOrZero() != -BBState.EntryValue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you improve the verification here by making EntryValue/ExitValue be std::optionals too, and using the to track the presence/absence of a call frame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When implementing this, I noticed cases where MachineBBs are inserted but their call frame size is not set, e.g., in X86FrameLowering and in AMDGPURegisterBankInfo.
The call frames in the tests that fail because of that with my adjusted MachineVerifier (one X86 and one AMDGPU) happen to be zero-sized, so the current implementation does not find an issue.
Do you recall if there was a reason why the call frame sizes for such MachineBBs introduced after ISel are not set in the original patch for storing the call frame size in MachineBasicBlock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was no good reason. It was just an oversight. If no tests failed, then I would not have noticed that there was a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I'll look for other such cases then and see if I can fix them. Thanks for the insight!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the PR to a version with the adjusted MachineVerifier and fixes for the lit tests that failed with it. However, it's difficult to say whether this captures all cases where a CALLSEQ might be split across MachineBBs. For instance, I found 126 references (in 62 files) to MachineBasicBlock::splice; I think we'd have to handle them (and other cases, e.g., where new MachineBBs are created) or check that they can't affect CALLSEQs to be really sure we caught everything.
An alternative would be to drop the stored call frame size from the MachineBBs and compute up-to-date call frame sizes only when they are needed, i.e., in the PrologueEpilogueInserter and where I use it in #106965.
What's your opinion on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative would be to drop the stored call frame size from the MachineBBs and compute up-to-date call frame sizes only when they are needed, i.e., in the PrologueEpilogueInserter
The whole point of storing the frame size in MachineBB
is so that PEI
does not have to do a complete forward walk over the function to compute frame sizes, in addition to the backward walk that it does to eliminate frame indices. My assumption was that the forward walk would be expensive, but if you can do it without angering https://llvm-compile-time-tracker.com/ then go for it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jayfoad I just updated this PR to (re)compute the call frame sizes on demand instead of storing them. The compile-time overhead seems to be relatively minor (see the CT report of the entire PR stack here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping, @jayfoad
Yes, PR #106965 (the second one in this stacked PR) relies on this for identifying locations that are inside any call frame (even zero-sized ones) to avoid nested CALLSEQs when thread-local-storage addresses are passed as a function argument. |
/// Call frame sizes for the MachineFunction's MachineBasicBlocks. This is set | ||
/// by the MachineFrameSizeInfo constructor and cleared by its destructor. | ||
MachineFrameSizeInfo *SizeInfo = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this type of ownership makes sense. MachineFrameInfo should be a compilation wide state as the function is processed. This shouldn't be set for one function in one pass that happens to compute it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation for that was to make passing this info to X86TargetLowering::EmitLoweredTLSAddr
less obtrusive in #106965. For this PR here, it's certainly not necessary. I can move this change over to the other PR for now and then we can discuss there how else to pass it to the function (or decide on a different way to avoid the nested CALLSEQs for TLSAddrs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in latest commit.
…chineBB Call frames can be zero-sized. Distinguishing between instructions in a zero-sized call frame and instructions outside of call frames via the call frame size info stored in the MachineBBs is helpful to avoid nested CALLSEQs (which are illegal and cause machine verifier errors). This patch uses std::optional<unsigned> instead of unsigned to represent call frame sizes. Locations outside of call frames are marked with std::nullopt whereas locations in zero-sized call-frames are represented with a std::optional that contains 0.
…om no call frames.
…ks was not set or updated.
Previously, we would need to update stored call frame sizes whenever a MachineBB with a call sequence is split or call frame instructions are moved. By recomputing them instead, this change avoids the need for keeping track of the call frame sizes throughout transformations. It also fixes several cases where updates of call frame sizes were missing. The overhead according to the compile time tracker is < +0.03% executed instructions on all CT benchmarks. This also allows distinguishing a zero-sized call frame from no open call frame, which is helpful to avoid nested CALLSEQs (which are illegal and cause machine verifier errors).
Also remove an unintended newline.
20aa681
to
7020475
Compare
Ping. I pulled in updates from main and updated the description and title of this PR, since it did have a bit of a shift in subject. Let me know if you'd prefer me to open a new PR for computing instead of storing call frame sizes instead. |
With PR #106965 now made obsolete by the merged #113706, this PR is no longer necessary to achieve the goal I was after. (Re)-computing instead of storing the call frame sizes (which this PR does) might however still be worthwhile to avoid errors caused by missed updates of call frame sizes. |
I'm ambivalent. Recomputing frame sizes instead of storing them seems like a cleaner approach, but it also seems like wasted effort in PEI to do a complete walk of the CFG just to recompute them. |
Then let's shelve this PR for now. |
Previously, we would need to update stored call frame sizes whenever a MachineBB with a call sequence is split or call frame instructions are moved. By recomputing them instead, this change avoids the need for keeping track of the call frame sizes throughout transformations. It also fixes several cases where updates of call frame sizes were missing.
The overhead according to the compile time tracker is < +0.03% executed instructions on all CT benchmark categories.
This also allows distinguishing a zero-sized call frame from no open call frame, which is helpful to avoid nested CALLSEQs (which are illegal and cause machine verifier errors).
Original Description, under the title "[CodeGen] Distinguish zero-sized call frames from no call frame in MachineBB":
Call frames can be zero-sized. Distinguishing between instructions in a
zero-sized call frame and instructions outside of call frames via the
call frame size info stored in the MachineBBs is helpful to avoid nested
CALLSEQs (which are illegal and cause machine verifier errors).
This patch uses std::optional instead of unsigned to represent
call frame sizes. Locations outside of call frames are marked with
std::nullopt whereas locations in zero-sized call-frames are represented
with a std::optional that contains 0.