Skip to content

[CFIFixup] Fixup CFI for split functions with synchronous uwtables #125299

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 11, 2025

Conversation

dhoekwater
Copy link
Contributor

@dhoekwater dhoekwater commented Jan 31, 2025

  • Precommit tests for synchronous uwtable CFI fixup
  • [CFIFixup] Fixup CFI for split functions with synchronous uwtables

Commit 6e54fcc disables CFI fixup for
functions with synchronous tables, breaking CFI for split functions.
Instead, we can disable block-level CFI fixup for functions with
synchronous tables.

Unwind tables can be:

  • N/A (not present)
  • Asynchronous
  • Synchronous

Functions without unwind tables don't need CFI fixup (since they don't
care about CFI).

Functions with asynchronous unwind tables must be accurate for each
basic block, so full CFI fixup is necessary.

Functions with synchronous unwind tables only need to be accurate for
each function (specifically, the portion of a function in a given
section). Disabling CFI fixup entirely for functions with synchronous
uwtables may break CFI for a function split between two sections. The
portion in the first section may have valid CFI, while the portion in
the second section is missing a call frame.

Ex:

(.text.hot)
Foo (BB1):
  <Call frame information>
  ...
BB2:
  ...

(.text.split)
BB3:
  ...
BB4:
  <epilogue>

Even if Foo has a synchronous unwind table, we still need to insert
call frame information into BB3 so that unwinding the call stack from
BB3 or BB4 works properly.

@dhoekwater dhoekwater changed the title cfi fixup [CFIFixup] Fixup CFI for split functions with synchronous uwtables Feb 7, 2025
Copy link

github-actions bot commented Feb 7, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@dhoekwater dhoekwater force-pushed the cfi-fixup branch 2 times, most recently from 9892c6f to d109c6c Compare February 7, 2025 19:35
@dhoekwater dhoekwater marked this pull request as ready for review February 7, 2025 19:59
@llvmbot
Copy link
Member

llvmbot commented Feb 7, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Daniel Hoekwater (dhoekwater)

Changes
  • Precommit tests for synchronous uwtable CFI fixup
  • [CFIFixup] Fixup CFI for split functions with synchronous uwtables

Commit 6e54fcc disables CFI fixup for
functions with synchronous tables, breaking CFI for split functions.
Instead, we can disable block-level CFI fixup for functions with
synchronous tables.

Unwind tables can be:

  • N/A (not present)
  • Asynchronous
  • Synchronous

Functions without unwind tables don't need CFI fixup (since they don't
care about CFI).

Functions with asynchronous unwind tables must be accurate for each
basic block, so full CFI fixup is necessary.

Functions with synchronous unwind tables only need to be accurate for
each function (specifically, the portion of a function in a given
section). Disabling CFI fixup entirely for functions with synchronous
uwtables may break CFI for a function split between two sections. The
portion in the first section may have valid CFI, while the portion in
the second section is missing a call frame.

Ex:

(.text.hot)
Foo (BB1):
  &lt;Call frame information&gt;
  ...
BB2:
  ...

(.text.split)
BB3:
  ...
BB4:
  &lt;epilogue&gt;

Even if Foo has a synchronous unwind table, we still need to insert
call frame information into BB3 so that unwinding the call stack from
BB3 or BB4 works properly.


Full diff: https://github.com/llvm/llvm-project/pull/125299.diff

6 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/TargetFrameLowering.h (+8-2)
  • (modified) llvm/lib/CodeGen/CFIFixup.cpp (+6)
  • (modified) llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+7-1)
  • (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.h (+3-1)
  • (modified) llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir (+187)
diff --git a/llvm/include/llvm/CodeGen/TargetFrameLowering.h b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
index 97de0197da9b400..09c19cce4e57865 100644
--- a/llvm/include/llvm/CodeGen/TargetFrameLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
@@ -229,10 +229,16 @@ class TargetFrameLowering {
 
   /// Returns true if we may need to fix the unwind information for the
   /// function.
-  virtual bool enableCFIFixup(MachineFunction &MF) const;
+  virtual bool enableCFIFixup(const MachineFunction &MF) const;
+
+  /// enableBlockLevelCFIFixup - Returns true if we may need to fix up the
+  /// function at a basic block level (e.g. for async unwind tables).
+  virtual bool enableBlockLevelCFIFixup(const MachineFunction &MF) const {
+    return enableCFIFixup(MF);
+  };
 
   /// Emit CFI instructions that recreate the state of the unwind information
-  /// upon fucntion entry.
+  /// upon function entry.
   virtual void resetCFIToInitialState(MachineBasicBlock &MBB) const {}
 
   /// Replace a StackProbe stub (if any) with the actual probe code inline
diff --git a/llvm/lib/CodeGen/CFIFixup.cpp b/llvm/lib/CodeGen/CFIFixup.cpp
index 7778aa637ff08da..d55e477ecf77be0 100644
--- a/llvm/lib/CodeGen/CFIFixup.cpp
+++ b/llvm/lib/CodeGen/CFIFixup.cpp
@@ -80,6 +80,7 @@
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCDwarf.h"
+#include "llvm/Support/CodeGen.h"
 #include "llvm/Target/TargetMachine.h"
 
 #include <iterator>
@@ -249,6 +250,11 @@ fixupBlock(MachineBasicBlock &CurrBB, const BlockFlagsVector &BlockInfo,
   if (!Info.Reachable)
     return false;
 
+  // If we don't need to fix up CFI at the block level, we only need to fix up
+  // the first basic block in the section.
+  if (!TFL.enableBlockLevelCFIFixup(MF) && !CurrBB.isBeginSection())
+    return false;
+
   // If the previous block and the current block are in the same section,
   // the frame info will propagate from the previous block to the current one.
   const BlockFlags &PrevInfo =
diff --git a/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp b/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
index 08e86c705786c74..5784974cd8ed985 100644
--- a/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
+++ b/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
@@ -35,7 +35,7 @@ bool TargetFrameLowering::enableCalleeSaveSkip(const MachineFunction &MF) const
   return false;
 }
 
-bool TargetFrameLowering::enableCFIFixup(MachineFunction &MF) const {
+bool TargetFrameLowering::enableCFIFixup(const MachineFunction &MF) const {
   return MF.needsFrameMoves() &&
          !MF.getTarget().getMCAsmInfo()->usesWindowsCFI();
 }
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index a082a1ebe95bf84..968fcc253e00eed 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -2609,8 +2609,14 @@ void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
   }
 }
 
-bool AArch64FrameLowering::enableCFIFixup(MachineFunction &MF) const {
+bool AArch64FrameLowering::enableCFIFixup(const MachineFunction &MF) const {
   return TargetFrameLowering::enableCFIFixup(MF) &&
+         MF.getInfo<AArch64FunctionInfo>()->needsDwarfUnwindInfo(MF);
+}
+
+bool AArch64FrameLowering::enableBlockLevelCFIFixup(
+    const MachineFunction &MF) const {
+  return enableCFIFixup(MF) &&
          MF.getInfo<AArch64FunctionInfo>()->needsAsyncDwarfUnwindInfo(MF);
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.h b/llvm/lib/Target/AArch64/AArch64FrameLowering.h
index 8f84702f4d2baff..82b214c7a35f63a 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.h
@@ -36,7 +36,9 @@ class AArch64FrameLowering : public TargetFrameLowering {
   void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
   void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
 
-  bool enableCFIFixup(MachineFunction &MF) const override;
+  bool enableCFIFixup(const MachineFunction &MF) const override;
+
+  bool enableBlockLevelCFIFixup(const MachineFunction &MF) const override;
 
   bool canUseAsPrologue(const MachineBasicBlock &MBB) const override;
 
diff --git a/llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir b/llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir
index a24972d1388320f..a4c88be375c0148 100644
--- a/llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir
+++ b/llvm/test/CodeGen/AArch64/cfi-fixup-multi-section.mir
@@ -10,9 +10,19 @@
     ret i32 0
   }
 
+  define i32 @f1(i32 %x) #1 {
+  entry: br label %return
+  if.end: br label %return
+  if.then2: br label %return
+  if.else: br label %return
+  return:
+    ret i32 0
+  }
+
   declare i32 @g(i32)
 
   attributes #0 = { nounwind shadowcallstack uwtable "sign-return-address"="non-leaf" "target-features"="+reserve-x18" }
+  attributes #1 = { nounwind shadowcallstack uwtable(sync) "sign-return-address"="non-leaf" "target-features"="+reserve-x18" }
 
 ...
 ---
@@ -197,4 +207,181 @@ body:             |
     B %bb.7
 
 
+...
+---
+name:            f1
+alignment:       4
+exposesReturnsTwice: false
+legalized:       false
+regBankSelected: false
+selected:        false
+failedISel:      false
+tracksRegLiveness: true
+hasWinCFI:       false
+failsVerification: false
+registers:       []
+liveins:
+  - { reg: '$w0', virtual-reg: '' }
+frameInfo:
+  isFrameAddressTaken: false
+  isReturnAddressTaken: false
+  hasStackMap:     false
+  hasPatchPoint:   false
+  stackSize:       16
+  offsetAdjustment: 0
+  maxAlignment:    16
+  adjustsStack:    true
+  hasCalls:        true
+  stackProtector:  ''
+  maxCallFrameSize: 0
+  cvBytesOfCalleeSavedRegisters: 0
+  hasOpaqueSPAdjustment: false
+  hasVAStart:      false
+  hasMustTailInVarArgFunc: false
+  hasTailCall:     false
+  localFrameSize:  0
+  savePoint:       ''
+  restorePoint:    ''
+fixedStack:      []
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: -16, size: 8, alignment: 16,
+      stack-id: default, callee-saved-register: '$lr', callee-saved-restored: true,
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+callSites:       []
+debugValueSubstitutions: []
+constants:       []
+body:             |
+  ; CHECK-LABEL: name: f1
+  ; CHECK: bb.0.entry:
+  ; CHECK-NEXT:   successors: %bb.4(0x30000000), %bb.1(0x50000000)
+  ; CHECK-NEXT:   liveins: $w0, $lr, $x18
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   CBZW renamable $w0, %bb.4
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1.if.end:
+  ; CHECK-NEXT:   successors: %bb.3(0x30000000), %bb.2(0x50000000)
+  ; CHECK-NEXT:   liveins: $w0, $lr, $x18
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   early-clobber $x18 = frame-setup STRXpost $lr, $x18, 8
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION escape 0x16, 0x12, 0x02, 0x82, 0x78
+  ; CHECK-NEXT:   frame-setup PACIASP implicit-def $lr, implicit killed $lr, implicit $sp
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION negate_ra_sign_state
+  ; CHECK-NEXT:   early-clobber $sp = frame-setup STRXpre killed $lr, $sp, -16 :: (store (s64) into %stack.0)
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION def_cfa_offset 16
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION offset $w30, -16
+  ; CHECK-NEXT:   TBNZW renamable $w0, 31, %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2.if.else:
+  ; CHECK-NEXT:   successors: %bb.5(0x80000000)
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   renamable $w0 = nuw nsw ADDWri killed renamable $w0, 1, 0
+  ; CHECK-NEXT:   BL @g, csr_aarch64_aapcs_scs, implicit-def dead $lr, implicit $sp, implicit killed $w0, implicit-def $sp, implicit-def $w0
+  ; CHECK-NEXT:   renamable $w8 = MOVZWi 1, 0
+  ; CHECK-NEXT:   $w0 = SUBWrs killed renamable $w8, killed renamable $w0, 0
+  ; CHECK-NEXT:   B %bb.5
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3.if.then2 (bbsections 1):
+  ; CHECK-NEXT:   successors: %bb.5(0x80000000)
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION escape 0x16, 0x12, 0x02, 0x82, 0x78
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION negate_ra_sign_state
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION def_cfa_offset 16
+  ; CHECK-NEXT:   frame-setup CFI_INSTRUCTION offset $w30, -16
+  ; CHECK-NEXT:   renamable $w0 = nsw SUBWri killed renamable $w0, 1, 0
+  ; CHECK-NEXT:   BL @g, csr_aarch64_aapcs_scs, implicit-def dead $lr, implicit $sp, implicit killed $w0, implicit-def $sp, implicit-def $w0
+  ; CHECK-NEXT:   renamable $w0 = nsw ADDWri killed renamable $w0, 1, 0
+  ; CHECK-NEXT:   B %bb.5
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.4.return:
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   RET undef $lr, implicit killed $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.5.return:
+  ; CHECK-NEXT:   successors: %bb.7(0x80000000)
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   B %bb.7
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.6.return:
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   early-clobber $sp, $lr = frame-destroy LDRXpost $sp, 16 :: (load (s64) from %stack.0)
+  ; CHECK-NEXT:   frame-destroy CFI_INSTRUCTION def_cfa_offset 0
+  ; CHECK-NEXT:   frame-destroy AUTIASP implicit-def $lr, implicit killed $lr, implicit $sp
+  ; CHECK-NEXT:   frame-destroy CFI_INSTRUCTION negate_ra_sign_state
+  ; CHECK-NEXT:   early-clobber $x18, $lr = frame-destroy LDRXpre $x18, -8
+  ; CHECK-NEXT:   frame-destroy CFI_INSTRUCTION restore $w18
+  ; CHECK-NEXT:   frame-destroy CFI_INSTRUCTION restore $w30
+  ; CHECK-NEXT:   RET undef $lr, implicit killed $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.7.return:
+  ; CHECK-NEXT:   successors: %bb.6(0x80000000)
+  ; CHECK-NEXT:   liveins: $w0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   B %bb.6
+  bb.0.entry:
+    successors: %bb.4(0x30000000), %bb.1(0x50000000)
+    liveins: $w0, $lr, $x18
+
+    CBZW renamable $w0, %bb.4
+
+  bb.1.if.end:
+    successors: %bb.3(0x30000000), %bb.2(0x50000000)
+    liveins: $w0, $lr, $x18
+
+    early-clobber $x18 = frame-setup STRXpost $lr, $x18, 8
+    frame-setup CFI_INSTRUCTION escape 0x16, 0x12, 0x02, 0x82, 0x78
+    frame-setup PACIASP implicit-def $lr, implicit killed $lr, implicit $sp
+    frame-setup CFI_INSTRUCTION negate_ra_sign_state
+    early-clobber $sp = frame-setup STRXpre killed $lr, $sp, -16 :: (store (s64) into %stack.0)
+    frame-setup CFI_INSTRUCTION def_cfa_offset 16
+    frame-setup CFI_INSTRUCTION offset $w30, -16
+    TBNZW renamable $w0, 31, %bb.3
+
+  bb.2.if.else:
+    successors: %bb.5(0x80000000)
+    liveins: $w0
+
+    renamable $w0 = nuw nsw ADDWri killed renamable $w0, 1, 0
+    BL @g, csr_aarch64_aapcs_scs, implicit-def dead $lr, implicit $sp, implicit killed $w0, implicit-def $sp, implicit-def $w0
+    renamable $w8 = MOVZWi 1, 0
+    $w0 = SUBWrs killed renamable $w8, killed renamable $w0, 0
+    B %bb.5
+
+  bb.3.if.then2 (bbsections 1):
+    successors: %bb.5(0x80000000)
+    liveins: $w0
+
+    renamable $w0 = nsw SUBWri killed renamable $w0, 1, 0
+    BL @g, csr_aarch64_aapcs_scs, implicit-def dead $lr, implicit $sp, implicit killed $w0, implicit-def $sp, implicit-def $w0
+    renamable $w0 = nsw ADDWri killed renamable $w0, 1, 0
+    B %bb.5
+
+  bb.4.return:
+    liveins: $w0
+    RET undef $lr, implicit killed $w0
+
+  bb.5.return:
+    liveins: $w0
+    B %bb.6
+
+  bb.7.return:
+    liveins: $w0
+    early-clobber $sp, $lr = frame-destroy LDRXpost $sp, 16 :: (load (s64) from %stack.0)
+    frame-destroy CFI_INSTRUCTION def_cfa_offset 0
+    frame-destroy AUTIASP implicit-def $lr, implicit killed $lr, implicit $sp
+    frame-destroy CFI_INSTRUCTION negate_ra_sign_state
+    early-clobber $x18, $lr = frame-destroy LDRXpre $x18, -8
+    frame-destroy CFI_INSTRUCTION restore $w18
+    frame-destroy CFI_INSTRUCTION restore $w30
+    RET undef $lr, implicit killed $w0
+
+  bb.6.return:
+    liveins: $w0
+    B %bb.7
+
+
 ...

@momchil-velikov
Copy link
Collaborator

May I suggest a few changes in wording:

Functions with asynchronous unwind tables must be accurate for each
basic block, so full CFI fixup is necessary.

Functions with async unwind tables needs to be accurate at each individual instruction.

Functions with synchronous unwind tables only need to be accurate for
each function

Functions with sync unwind tables needs to be accurate only at specific points where they may (synchronously)
trigger unwinding. In practical terms, that relaxes the requirements towards the sync case by not requiring instruction
accurate unwind tables during prologue and not requiring epilogue unwind instructions at all.

As you mentioned

so full CFI fixup is necessary.

I would suggest renaming enableBlockLevelCFIFixup to enableFull[Function]CFIFixup.

@dhoekwater
Copy link
Contributor Author

May I suggest a few changes in wording:

Functions with asynchronous unwind tables must be accurate for each
basic block, so full CFI fixup is necessary.

Functions with async unwind tables needs to be accurate at each individual instruction.

Functions with synchronous unwind tables only need to be accurate for
each function

Functions with sync unwind tables needs to be accurate only at specific points where they may (synchronously) trigger unwinding. In practical terms, that relaxes the requirements towards the sync case by not requiring instruction accurate unwind tables during prologue and not requiring epilogue unwind instructions at all.

As you mentioned

so full CFI fixup is necessary.

I would suggest renaming enableBlockLevelCFIFixup to enableFull[Function]CFIFixup.

Done!

Commit 6e54fcc disables CFI fixup for
functions with synchronous tables, breaking CFI for split functions.
Instead, we can disable *block-level* CFI fixup for functions with
synchronous tables.

Unwind tables can be:
- N/A (not present)
- Asynchronous
- Synchronous

Functions without unwind tables don't need CFI fixup (since they don't
care about CFI).

Functions with asynchronous unwind tables need to be accurate for each
individual instruction, so full CFI fixup is necessary.

Functions with synchronous unwind tables only need to be accurate at
specific points where they may trigger unwinding. This property relaxes
the requirements for these functions, and
6e54fcc takes advantage of the relaxed
requirements to emit fewer CFI instructions and skip CFI fixup in this
case. However, disabling CFI fixup entirely for functions with
synchronous uwtables may break CFI for a function split between two
sections. The portion in the first section may have valid CFI, while the
portion in the second section is missing a call frame.

Ex:
```
(.text.hot)
Foo (BB1):
  <Call frame information>
  ...
BB2:
  ...

(.text.split)
BB3:
  ...
BB4:
  <epilogue>
```

Even if `Foo` has a synchronous unwind table, we still need to insert
call frame information into `BB3` so that unwinding the call stack from
`BB3` or `BB4` works properly.
@dhoekwater dhoekwater merged commit 3a22cf9 into llvm:main Feb 11, 2025
8 checks passed
@dhoekwater dhoekwater deleted the cfi-fixup branch February 11, 2025 23:25
flovent pushed a commit to flovent/llvm-project that referenced this pull request Feb 13, 2025
…lvm#125299)

- **Precommit tests for synchronous uwtable CFI fixup**
- **[CFIFixup] Fixup CFI for split functions with synchronous uwtables**

Commit
llvm@6e54fcc
disables CFI fixup for
functions with synchronous tables, breaking CFI for split functions.
Instead, we can disable *block-level* CFI fixup for functions with
synchronous tables.

Unwind tables can be:
- N/A (not present)
- Asynchronous
- Synchronous

Functions without unwind tables don't need CFI fixup (since they don't
care about CFI).

Functions with asynchronous unwind tables must be accurate for each
basic block, so full CFI fixup is necessary.

Functions with synchronous unwind tables only need to be accurate for
each function (specifically, the portion of a function in a given
section). Disabling CFI fixup entirely for functions with synchronous
uwtables may break CFI for a function split between two sections. The
portion in the first section may have valid CFI, while the portion in
the second section is missing a call frame.

Ex:
```
(.text.hot)
Foo (BB1):
  <Call frame information>
  ...
BB2:
  ...

(.text.split)
BB3:
  ...
BB4:
  <epilogue>
```

Even if `Foo` has a synchronous unwind table, we still need to insert
call frame information into `BB3` so that unwinding the call stack from
`BB3` or `BB4` works properly.
joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 14, 2025
…lvm#125299)

- **Precommit tests for synchronous uwtable CFI fixup**
- **[CFIFixup] Fixup CFI for split functions with synchronous uwtables**

Commit
llvm@6e54fcc
disables CFI fixup for
functions with synchronous tables, breaking CFI for split functions.
Instead, we can disable *block-level* CFI fixup for functions with
synchronous tables.

Unwind tables can be:
- N/A (not present)
- Asynchronous
- Synchronous

Functions without unwind tables don't need CFI fixup (since they don't
care about CFI).

Functions with asynchronous unwind tables must be accurate for each
basic block, so full CFI fixup is necessary.

Functions with synchronous unwind tables only need to be accurate for
each function (specifically, the portion of a function in a given
section). Disabling CFI fixup entirely for functions with synchronous
uwtables may break CFI for a function split between two sections. The
portion in the first section may have valid CFI, while the portion in
the second section is missing a call frame.

Ex:
```
(.text.hot)
Foo (BB1):
  <Call frame information>
  ...
BB2:
  ...

(.text.split)
BB3:
  ...
BB4:
  <epilogue>
```

Even if `Foo` has a synchronous unwind table, we still need to insert
call frame information into `BB3` so that unwinding the call stack from
`BB3` or `BB4` works properly.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
…lvm#125299)

- **Precommit tests for synchronous uwtable CFI fixup**
- **[CFIFixup] Fixup CFI for split functions with synchronous uwtables**

Commit
llvm@6e54fcc
disables CFI fixup for
functions with synchronous tables, breaking CFI for split functions.
Instead, we can disable *block-level* CFI fixup for functions with
synchronous tables.

Unwind tables can be:
- N/A (not present)
- Asynchronous
- Synchronous

Functions without unwind tables don't need CFI fixup (since they don't
care about CFI).

Functions with asynchronous unwind tables must be accurate for each
basic block, so full CFI fixup is necessary.

Functions with synchronous unwind tables only need to be accurate for
each function (specifically, the portion of a function in a given
section). Disabling CFI fixup entirely for functions with synchronous
uwtables may break CFI for a function split between two sections. The
portion in the first section may have valid CFI, while the portion in
the second section is missing a call frame.

Ex:
```
(.text.hot)
Foo (BB1):
  <Call frame information>
  ...
BB2:
  ...

(.text.split)
BB3:
  ...
BB4:
  <epilogue>
```

Even if `Foo` has a synchronous unwind table, we still need to insert
call frame information into `BB3` so that unwinding the call stack from
`BB3` or `BB4` works properly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants