Skip to content

[CodeGen] Compute call frame sizes instead of storing them in MachineBBs #106964

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

ritter-x2a
Copy link
Member

@ritter-x2a ritter-x2a commented Sep 2, 2024

Previously, we would need to update stored call frame sizes whenever a MachineBB with a call sequence is split or call frame instructions are moved. By recomputing them instead, this change avoids the need for keeping track of the call frame sizes throughout transformations. It also fixes several cases where updates of call frame sizes were missing.

The overhead according to the compile time tracker is < +0.03% executed instructions on all CT benchmark categories.

This also allows distinguishing a zero-sized call frame from no open call frame, which is helpful to avoid nested CALLSEQs (which are illegal and cause machine verifier errors).


Original Description, under the title "[CodeGen] Distinguish zero-sized call frames from no call frame in MachineBB":
Call frames can be zero-sized. Distinguishing between instructions in a
zero-sized call frame and instructions outside of call frames via the
call frame size info stored in the MachineBBs is helpful to avoid nested
CALLSEQs (which are illegal and cause machine verifier errors).

This patch uses std::optional instead of unsigned to represent
call frame sizes. Locations outside of call frames are marked with
std::nullopt whereas locations in zero-sized call-frames are represented
with a std::optional that contains 0.

Copy link
Member Author

@llvmbot
Copy link
Member

llvmbot commented Sep 2, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-m68k

@llvm/pr-subscribers-backend-arm

Author: Fabian Ritter (ritter-x2a)

Changes

Call frames can be zero-sized. Distinguishing between instructions in a
zero-sized call frame and instructions outside of call frames via the
call frame size info stored in the MachineBBs is helpful to avoid nested
CALLSEQs (which are illegal and cause machine verifier errors).

This patch uses std::optional<unsigned> instead of unsigned to represent
call frame sizes. Locations outside of call frames are marked with
std::nullopt whereas locations in zero-sized call-frames are represented
with a std::optional that contains 0.


Full diff: https://github.com/llvm/llvm-project/pull/106964.diff

14 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+14-5)
  • (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+3-2)
  • (modified) llvm/lib/CodeGen/MIRParser/MIParser.cpp (+3-3)
  • (modified) llvm/lib/CodeGen/MachineBasicBlock.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/MachineVerifier.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/PrologEpilogInserter.cpp (+10-10)
  • (modified) llvm/lib/CodeGen/TargetInstrInfo.cpp (+3-3)
  • (modified) llvm/lib/Target/ARM/ARMISelLowering.cpp (+2-2)
  • (modified) llvm/lib/Target/AVR/AVRISelLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/M68k/M68kISelLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/PowerPC/PPCISelLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+2-1)
  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+1-1)
  • (modified) llvm/test/CodeGen/MIR/ARM/call-frame-size.mir (+22)
diff --git a/llvm/include/llvm/CodeGen/MachineBasicBlock.h b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
index 6efb17c55493a9..cbc2c66d8bb5d8 100644
--- a/llvm/include/llvm/CodeGen/MachineBasicBlock.h
+++ b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
@@ -146,12 +146,13 @@ class MachineBasicBlock
   int Number;
 
   /// The call frame size on entry to this basic block due to call frame setup
-  /// instructions in a predecessor. This is usually zero, unless basic blocks
-  /// are split in the middle of a call sequence.
+  /// instructions in a predecessor. Usually does not contain a value, unless
+  /// basic blocks are split in the middle of a call sequence. Zero-sized call
+  /// frames are possible.
   ///
   /// This information is only maintained until PrologEpilogInserter eliminates
   /// call frame pseudos.
-  unsigned CallFrameSize = 0;
+  std::optional<unsigned> CallFrameSize;
 
   MachineFunction *xParent;
   Instructions Insts;
@@ -1210,9 +1211,17 @@ class MachineBasicBlock
   void setNumber(int N) { Number = N; }
 
   /// Return the call frame size on entry to this basic block.
-  unsigned getCallFrameSize() const { return CallFrameSize; }
+  std::optional<unsigned> getCallFrameSize() const { return CallFrameSize; }
+
+  /// Return the call frame size on entry to this basic block if it is part of a
+  /// call sequence and 0 otherwise.
+  unsigned getCallFrameSizeOrZero() const { return CallFrameSize.value_or(0); }
+
   /// Set the call frame size on entry to this basic block.
-  void setCallFrameSize(unsigned N) { CallFrameSize = N; }
+  void setCallFrameSize(std::optional<unsigned> N) { CallFrameSize = N; }
+
+  /// Reset the stored call frame size.
+  void clearCallFrameSize() { CallFrameSize.reset(); }
 
   /// Return the MCSymbol for this basic block.
   MCSymbol *getSymbol() const;
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 49ce13dd8cbe39..0ae3a0ca53757a 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -2265,8 +2265,9 @@ class TargetInstrInfo : public MCInstrInfo {
     return false;
   }
 
-  // Get the call frame size just before MI.
-  unsigned getCallFrameSizeAt(MachineInstr &MI) const;
+  /// Get the call frame size just before MI. Contains no value if MI is not in
+  /// a call sequence. Zero-sized call frames are possible.
+  std::optional<unsigned> getCallFrameSizeAt(MachineInstr &MI) const;
 
   /// Fills in the necessary MachineOperands to refer to a frame index.
   /// The best way to understand this is to print `asm(""::"m"(x));` after
diff --git a/llvm/lib/CodeGen/MIRParser/MIParser.cpp b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
index 47b220172602d4..4295956b2a5cc9 100644
--- a/llvm/lib/CodeGen/MIRParser/MIParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
@@ -502,7 +502,7 @@ class MIParser {
   bool parseAddrspace(unsigned &Addrspace);
   bool parseSectionID(std::optional<MBBSectionID> &SID);
   bool parseBBID(std::optional<UniqueBBID> &BBID);
-  bool parseCallFrameSize(unsigned &CallFrameSize);
+  bool parseCallFrameSize(std::optional<unsigned> &CallFrameSize);
   bool parseOperandsOffset(MachineOperand &Op);
   bool parseIRValue(const Value *&V);
   bool parseMemoryOperandFlag(MachineMemOperand::Flags &Flags);
@@ -685,7 +685,7 @@ bool MIParser::parseBBID(std::optional<UniqueBBID> &BBID) {
 }
 
 // Parse basic block call frame size.
-bool MIParser::parseCallFrameSize(unsigned &CallFrameSize) {
+bool MIParser::parseCallFrameSize(std::optional<unsigned> &CallFrameSize) {
   assert(Token.is(MIToken::kw_call_frame_size));
   lex();
   unsigned Value = 0;
@@ -713,7 +713,7 @@ bool MIParser::parseBasicBlockDefinition(
   std::optional<MBBSectionID> SectionID;
   uint64_t Alignment = 0;
   std::optional<UniqueBBID> BBID;
-  unsigned CallFrameSize = 0;
+  std::optional<unsigned> CallFrameSize;
   BasicBlock *BB = nullptr;
   if (consumeIfPresent(MIToken::lparen)) {
     do {
diff --git a/llvm/lib/CodeGen/MachineBasicBlock.cpp b/llvm/lib/CodeGen/MachineBasicBlock.cpp
index 5d06af3ebf3360..a948efb41d7152 100644
--- a/llvm/lib/CodeGen/MachineBasicBlock.cpp
+++ b/llvm/lib/CodeGen/MachineBasicBlock.cpp
@@ -578,9 +578,9 @@ void MachineBasicBlock::printName(raw_ostream &os, unsigned printNameFlags,
         os << " " << getBBID()->CloneID;
       hasAttributes = true;
     }
-    if (CallFrameSize != 0) {
+    if (CallFrameSize.has_value()) {
       os << (hasAttributes ? ", " : " (");
-      os << "call-frame-size " << CallFrameSize;
+      os << "call-frame-size " << CallFrameSize.value();
       hasAttributes = true;
     }
   }
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index 759201ed9dadc7..62ece9ef54b58b 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -3837,11 +3837,11 @@ void MachineVerifier::verifyStackFrame() {
       BBState.ExitIsSetup = BBState.EntryIsSetup;
     }
 
-    if ((int)MBB->getCallFrameSize() != -BBState.EntryValue) {
+    if ((int)MBB->getCallFrameSizeOrZero() != -BBState.EntryValue) {
       report("Call frame size on entry does not match value computed from "
              "predecessor",
              MBB);
-      errs() << "Call frame size on entry " << MBB->getCallFrameSize()
+      errs() << "Call frame size on entry " << MBB->getCallFrameSizeOrZero()
              << " does not match value computed from predecessor "
              << -BBState.EntryValue << '\n';
     }
diff --git a/llvm/lib/CodeGen/PrologEpilogInserter.cpp b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
index ee03eaa8ae527c..b2f355c0aa8ccf 100644
--- a/llvm/lib/CodeGen/PrologEpilogInserter.cpp
+++ b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -392,9 +392,9 @@ void PEI::calculateCallFrameInfo(MachineFunction &MF) {
       TFI->eliminateCallFramePseudoInstr(MF, *I->getParent(), I);
 
     // We can't track the call frame size after call frame pseudos have been
-    // eliminated. Set it to zero everywhere to keep MachineVerifier happy.
+    // eliminated. Clear it everywhere to keep MachineVerifier happy.
     for (MachineBasicBlock &MBB : MF)
-      MBB.setCallFrameSize(0);
+      MBB.clearCallFrameSize();
   }
 }
 
@@ -1350,11 +1350,11 @@ void PEI::replaceFrameIndicesBackward(MachineFunction &MF) {
       // Get the SP adjustment for the end of MBB from the start of any of its
       // successors. They should all be the same.
       assert(all_of(MBB.successors(), [&MBB](const MachineBasicBlock *Succ) {
-        return Succ->getCallFrameSize() ==
-               (*MBB.succ_begin())->getCallFrameSize();
+        return Succ->getCallFrameSizeOrZero() ==
+               (*MBB.succ_begin())->getCallFrameSizeOrZero();
       }));
       const MachineBasicBlock &FirstSucc = **MBB.succ_begin();
-      SPAdj = TFI.alignSPAdjust(FirstSucc.getCallFrameSize());
+      SPAdj = TFI.alignSPAdjust(FirstSucc.getCallFrameSizeOrZero());
       if (TFI.getStackGrowthDirection() == TargetFrameLowering::StackGrowsUp)
         SPAdj = -SPAdj;
     }
@@ -1362,8 +1362,8 @@ void PEI::replaceFrameIndicesBackward(MachineFunction &MF) {
     replaceFrameIndicesBackward(&MBB, MF, SPAdj);
 
     // We can't track the call frame size after call frame pseudos have been
-    // eliminated. Set it to zero everywhere to keep MachineVerifier happy.
-    MBB.setCallFrameSize(0);
+    // eliminated. Clear it everywhere to keep MachineVerifier happy.
+    MBB.clearCallFrameSize();
   }
 }
 
@@ -1373,15 +1373,15 @@ void PEI::replaceFrameIndices(MachineFunction &MF) {
   const TargetFrameLowering &TFI = *MF.getSubtarget().getFrameLowering();
 
   for (auto &MBB : MF) {
-    int SPAdj = TFI.alignSPAdjust(MBB.getCallFrameSize());
+    int SPAdj = TFI.alignSPAdjust(MBB.getCallFrameSizeOrZero());
     if (TFI.getStackGrowthDirection() == TargetFrameLowering::StackGrowsUp)
       SPAdj = -SPAdj;
 
     replaceFrameIndices(&MBB, MF, SPAdj);
 
     // We can't track the call frame size after call frame pseudos have been
-    // eliminated. Set it to zero everywhere to keep MachineVerifier happy.
-    MBB.setCallFrameSize(0);
+    // eliminated. Clear it everywhere to keep MachineVerifier happy.
+    MBB.clearCallFrameSize();
   }
 }
 
diff --git a/llvm/lib/CodeGen/TargetInstrInfo.cpp b/llvm/lib/CodeGen/TargetInstrInfo.cpp
index 38bd0b0ba4114c..f623ba0352de8b 100644
--- a/llvm/lib/CodeGen/TargetInstrInfo.cpp
+++ b/llvm/lib/CodeGen/TargetInstrInfo.cpp
@@ -1624,15 +1624,15 @@ TargetInstrInfo::describeLoadedValue(const MachineInstr &MI,
   return std::nullopt;
 }
 
-// Get the call frame size just before MI.
-unsigned TargetInstrInfo::getCallFrameSizeAt(MachineInstr &MI) const {
+std::optional<unsigned>
+TargetInstrInfo::getCallFrameSizeAt(MachineInstr &MI) const {
   // Search backwards from MI for the most recent call frame instruction.
   MachineBasicBlock *MBB = MI.getParent();
   for (auto &AdjI : reverse(make_range(MBB->instr_begin(), MI.getIterator()))) {
     if (AdjI.getOpcode() == getCallFrameSetupOpcode())
       return getFrameTotalSize(AdjI);
     if (AdjI.getOpcode() == getCallFrameDestroyOpcode())
-      return 0;
+      return std::nullopt;
   }
 
   // If none was found, use the call frame size from the start of the basic
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 9096617a948557..e2c57d8b9ac524 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -11594,7 +11594,7 @@ ARMTargetLowering::EmitStructByval(MachineInstr &MI,
   MF->insert(It, exitMBB);
 
   // Set the call frame size on entry to the new basic blocks.
-  unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+  std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
   loopMBB->setCallFrameSize(CallFrameSize);
   exitMBB->setCallFrameSize(CallFrameSize);
 
@@ -12195,7 +12195,7 @@ ARMTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
     F->insert(It, sinkMBB);
 
     // Set the call frame size on entry to the new basic blocks.
-    unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+    std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
     copy0MBB->setCallFrameSize(CallFrameSize);
     sinkMBB->setCallFrameSize(CallFrameSize);
 
diff --git a/llvm/lib/Target/AVR/AVRISelLowering.cpp b/llvm/lib/Target/AVR/AVRISelLowering.cpp
index 0046e757f4efa2..cf6f026869da80 100644
--- a/llvm/lib/Target/AVR/AVRISelLowering.cpp
+++ b/llvm/lib/Target/AVR/AVRISelLowering.cpp
@@ -2424,7 +2424,7 @@ AVRTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
   MF->insert(I, falseMBB);
 
   // Set the call frame size on entry to the new basic blocks.
-  unsigned CallFrameSize = TII.getCallFrameSizeAt(MI);
+  std::optional<unsigned> CallFrameSize = TII.getCallFrameSizeAt(MI);
   trueMBB->setCallFrameSize(CallFrameSize);
   falseMBB->setCallFrameSize(CallFrameSize);
 
diff --git a/llvm/lib/Target/M68k/M68kISelLowering.cpp b/llvm/lib/Target/M68k/M68kISelLowering.cpp
index 316a6eebc2db0f..3b0a0c83d94d16 100644
--- a/llvm/lib/Target/M68k/M68kISelLowering.cpp
+++ b/llvm/lib/Target/M68k/M68kISelLowering.cpp
@@ -3199,7 +3199,7 @@ M68kTargetLowering::EmitLoweredSelect(MachineInstr &MI,
   F->insert(It, SinkMBB);
 
   // Set the call frame size on entry to the new basic blocks.
-  unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+  std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
   Copy0MBB->setCallFrameSize(CallFrameSize);
   SinkMBB->setCallFrameSize(CallFrameSize);
 
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 8a0858e2462520..f56443770095c7 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -13031,7 +13031,7 @@ PPCTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
 
     // Set the call frame size on entry to the new basic blocks.
     // See https://reviews.llvm.org/D156113.
-    unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+    std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
     copy0MBB->setCallFrameSize(CallFrameSize);
     sinkMBB->setCallFrameSize(CallFrameSize);
 
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d02078372b24a2..7fcc624e834b9e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -18444,7 +18444,8 @@ static MachineBasicBlock *emitSelectPseudo(MachineInstr &MI,
   F->insert(I, TailMBB);
 
   // Set the call frame size on entry to the new basic blocks.
-  unsigned CallFrameSize = TII.getCallFrameSizeAt(*LastSelectPseudo);
+  std::optional<unsigned> CallFrameSize =
+      TII.getCallFrameSizeAt(*LastSelectPseudo);
   IfFalseMBB->setCallFrameSize(CallFrameSize);
   TailMBB->setCallFrameSize(CallFrameSize);
 
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index f011249d295040..bbee0af109c74b 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -35264,7 +35264,7 @@ X86TargetLowering::EmitLoweredSelect(MachineInstr &MI,
   F->insert(It, SinkMBB);
 
   // Set the call frame size on entry to the new basic blocks.
-  unsigned CallFrameSize = TII->getCallFrameSizeAt(MI);
+  std::optional<unsigned> CallFrameSize = TII->getCallFrameSizeAt(MI);
   FalseMBB->setCallFrameSize(CallFrameSize);
   SinkMBB->setCallFrameSize(CallFrameSize);
 
diff --git a/llvm/test/CodeGen/MIR/ARM/call-frame-size.mir b/llvm/test/CodeGen/MIR/ARM/call-frame-size.mir
index d3e87120693cf3..58e7f50e583455 100644
--- a/llvm/test/CodeGen/MIR/ARM/call-frame-size.mir
+++ b/llvm/test/CodeGen/MIR/ARM/call-frame-size.mir
@@ -23,3 +23,25 @@ body: |
     ADJCALLSTACKUP 100, 0, 14, $noreg, implicit-def $sp, implicit $sp
   bb.2:
 ...
+
+---
+name: callframesizezero
+body: |
+  ; CHECK-LABEL: name: callframesizezero
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   ADJCALLSTACKDOWN 0, 0, 14 /* CC::al */, $noreg, implicit-def $sp, implicit $sp
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1 (call-frame-size 0):
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   ADJCALLSTACKUP 0, 0, 14 /* CC::al */, $noreg, implicit-def $sp, implicit $sp
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  bb.0:
+    ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
+  bb.1 (call-frame-size 0):
+    ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def $sp, implicit $sp
+  bb.2:
+...

Copy link
Contributor

@jayfoad jayfoad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distinguishing between instructions in a
zero-sized call frame and instructions outside of call frames via the
call frame size info stored in the MachineBBs is helpful to avoid nested
CALLSEQs

Do you have an example of where this is useful?

@@ -3837,11 +3837,11 @@ void MachineVerifier::verifyStackFrame() {
BBState.ExitIsSetup = BBState.EntryIsSetup;
}

if ((int)MBB->getCallFrameSize() != -BBState.EntryValue) {
if ((int)MBB->getCallFrameSizeOrZero() != -BBState.EntryValue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you improve the verification here by making EntryValue/ExitValue be std::optionals too, and using the to track the presence/absence of a call frame?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When implementing this, I noticed cases where MachineBBs are inserted but their call frame size is not set, e.g., in X86FrameLowering and in AMDGPURegisterBankInfo.
The call frames in the tests that fail because of that with my adjusted MachineVerifier (one X86 and one AMDGPU) happen to be zero-sized, so the current implementation does not find an issue.

Do you recall if there was a reason why the call frame sizes for such MachineBBs introduced after ISel are not set in the original patch for storing the call frame size in MachineBasicBlock?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no good reason. It was just an oversight. If no tests failed, then I would not have noticed that there was a problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'll look for other such cases then and see if I can fix them. Thanks for the insight!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the PR to a version with the adjusted MachineVerifier and fixes for the lit tests that failed with it. However, it's difficult to say whether this captures all cases where a CALLSEQ might be split across MachineBBs. For instance, I found 126 references (in 62 files) to MachineBasicBlock::splice; I think we'd have to handle them (and other cases, e.g., where new MachineBBs are created) or check that they can't affect CALLSEQs to be really sure we caught everything.
An alternative would be to drop the stored call frame size from the MachineBBs and compute up-to-date call frame sizes only when they are needed, i.e., in the PrologueEpilogueInserter and where I use it in #106965.
What's your opinion on that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to drop the stored call frame size from the MachineBBs and compute up-to-date call frame sizes only when they are needed, i.e., in the PrologueEpilogueInserter

The whole point of storing the frame size in MachineBB is so that PEI does not have to do a complete forward walk over the function to compute frame sizes, in addition to the backward walk that it does to eliminate frame indices. My assumption was that the forward walk would be expensive, but if you can do it without angering https://llvm-compile-time-tracker.com/ then go for it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jayfoad I just updated this PR to (re)compute the call frame sizes on demand instead of storing them. The compile-time overhead seems to be relatively minor (see the CT report of the entire PR stack here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping, @jayfoad

Copy link
Member Author

Distinguishing between instructions in a
zero-sized call frame and instructions outside of call frames via the
call frame size info stored in the MachineBBs is helpful to avoid nested
CALLSEQs

Do you have an example of where this is useful?

Yes, PR #106965 (the second one in this stacked PR) relies on this for identifying locations that are inside any call frame (even zero-sized ones) to avoid nested CALLSEQs when thread-local-storage addresses are passed as a function argument.

Comment on lines 287 to 289
/// Call frame sizes for the MachineFunction's MachineBasicBlocks. This is set
/// by the MachineFrameSizeInfo constructor and cleared by its destructor.
MachineFrameSizeInfo *SizeInfo = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this type of ownership makes sense. MachineFrameInfo should be a compilation wide state as the function is processed. This shouldn't be set for one function in one pass that happens to compute it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation for that was to make passing this info to X86TargetLowering::EmitLoweredTLSAddr less obtrusive in #106965. For this PR here, it's certainly not necessary. I can move this change over to the other PR for now and then we can discuss there how else to pass it to the function (or decide on a different way to avoid the nested CALLSEQs for TLSAddrs).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in latest commit.

…chineBB

Call frames can be zero-sized. Distinguishing between instructions in a
zero-sized call frame and instructions outside of call frames via the
call frame size info stored in the MachineBBs is helpful to avoid nested
CALLSEQs (which are illegal and cause machine verifier errors).

This patch uses std::optional<unsigned> instead of unsigned to represent
call frame sizes. Locations outside of call frames are marked with
std::nullopt whereas locations in zero-sized call-frames are represented
with a std::optional that contains 0.
Previously, we would need to update stored call frame sizes whenever a
MachineBB with a call sequence is split or call frame instructions are
moved. By recomputing them instead, this change avoids the need for
keeping track of the call frame sizes throughout transformations. It
also fixes several cases where updates of call frame sizes were missing.

The overhead according to the compile time tracker is < +0.03% executed
instructions on all CT benchmarks.

This also allows distinguishing a zero-sized call frame from no open
call frame, which is helpful to avoid nested CALLSEQs (which are illegal
and cause machine verifier errors).
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/track-zero-sized-call-frames-in-machinebb branch from 20aa681 to 7020475 Compare October 10, 2024 10:15
@ritter-x2a ritter-x2a changed the title [CodeGen] Distinguish zero-sized call frames from no call frame in MachineBB [CodeGen] Compute call frame sizes instead of storing them in MachineBBs Oct 10, 2024
@ritter-x2a
Copy link
Member Author

Ping. I pulled in updates from main and updated the description and title of this PR, since it did have a bit of a shift in subject. Let me know if you'd prefer me to open a new PR for computing instead of storing call frame sizes instead.

@ritter-x2a
Copy link
Member Author

With PR #106965 now made obsolete by the merged #113706, this PR is no longer necessary to achieve the goal I was after. (Re)-computing instead of storing the call frame sizes (which this PR does) might however still be worthwhile to avoid errors caused by missed updates of call frame sizes.
Let me know if there is interest in still pursuing this PR, otherwise I will close it.

@jayfoad
Copy link
Contributor

jayfoad commented Nov 4, 2024

I'm ambivalent. Recomputing frame sizes instead of storing them seems like a cleaner approach, but it also seems like wasted effort in PEI to do a complete walk of the CFG just to recompute them.

@ritter-x2a
Copy link
Member Author

I'm ambivalent. Recomputing frame sizes instead of storing them seems like a cleaner approach, but it also seems like wasted effort in PEI to do a complete walk of the CFG just to recompute them.

Then let's shelve this PR for now.

@ritter-x2a ritter-x2a closed this Nov 11, 2024
@ritter-x2a ritter-x2a deleted the users/ritter-x2a/track-zero-sized-call-frames-in-machinebb branch November 26, 2024 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants