[BOLT] Offset LPStart to avoid unnecessary instructions #116713

maksfb · 2024-11-18T23:29:23Z

For C++ exception handling, when we write a call site table, we must avoid emitting 0-value offsets for landing pads unless the call site has no landing pad. However, 0 can be a real offset from the start of the FDE if the FDE corresponds to a function fragment that starts with a landing pad. In such cases, we used to emit a trap instruction at the start of the fragment to guarantee non-zero LP offset.

To avoid emitting unnecessary trap instructions, we can instead set LPStart to an offset from the FDE. If we emit it as [FDEStart - 1], then all real offsets from LPStart in FDE become non-negative.

For C++ exception handling, when we write a call site table, we must avoid emitting 0-value offsets for landing pads unless the call site has no landing pad. However, 0 can be a real offset from the start of the FDE if the FDE corresponds to a function fragment that starts with a landing pad. In such cases, we used to emit a trap instruction at the start of the fragment to guarantee non-zero LP offset. To avoid emitting unnecessary trap instructions, we can instead set LPStart to an offset from the FDE. If we emit it as [FDEStart - 1], then all real offsets from LPStart in FDE become non-negative.

llvmbot · 2024-11-18T23:29:44Z

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

Changes

For C++ exception handling, when we write a call site table, we must avoid emitting 0-value offsets for landing pads unless the call site has no landing pad. However, 0 can be a real offset from the start of the FDE if the FDE corresponds to a function fragment that starts with a landing pad. In such cases, we used to emit a trap instruction at the start of the fragment to guarantee non-zero LP offset.

To avoid emitting unnecessary trap instructions, we can instead set LPStart to an offset from the FDE. If we emit it as [FDEStart - 1], then all real offsets from LPStart in FDE become non-negative.

Full diff: https://github.com/llvm/llvm-project/pull/116713.diff

1 Files Affected:

(modified) bolt/lib/Core/BinaryEmitter.cpp (+37-33)

diff --git a/bolt/lib/Core/BinaryEmitter.cpp b/bolt/lib/Core/BinaryEmitter.cpp
index bdf784ec7b6f3e..08590d3995342d 100644
--- a/bolt/lib/Core/BinaryEmitter.cpp
+++ b/bolt/lib/Core/BinaryEmitter.cpp
@@ -416,17 +416,6 @@ void BinaryEmitter::emitFunctionBody(BinaryFunction &BF, FunctionFragment &FF,
     BF.duplicateConstantIslands();
   }
 
-  if (!FF.empty() && FF.front()->isLandingPad()) {
-    assert(!FF.front()->isEntryPoint() &&
-           "Landing pad cannot be entry point of function");
-    // If the first block of the fragment is a landing pad, it's offset from the
-    // start of the area that the corresponding LSDA describes is zero. In this
-    // case, the call site entries in that LSDA have 0 as offset to the landing
-    // pad, which the runtime interprets as "no handler". To prevent this,
-    // insert some padding.
-    Streamer.emitBytes(BC.MIB->getTrapFillValue());
-  }
-
   // Track the first emitted instruction with debug info.
   bool FirstInstr = true;
   for (BinaryBasicBlock *const BB : FF) {
@@ -926,39 +915,54 @@ void BinaryEmitter::emitLSDA(BinaryFunction &BF, const FunctionFragment &FF) {
   // Emit the LSDA header.
 
   // If LPStart is omitted, then the start of the FDE is used as a base for
-  // landing pad displacements. Then if a cold fragment starts with
-  // a landing pad, this means that the first landing pad offset will be 0.
-  // As a result, the exception handling runtime will ignore this landing pad
-  // because zero offset denotes the absence of a landing pad.
-  // For this reason, when the binary has fixed starting address we emit LPStart
-  // as 0 and output the absolute value of the landing pad in the table.
+  // landing pad displacements. Then, if a cold fragment starts with a landing
+  // pad, this means that the first landing pad offset will be 0. However, C++
+  // runtime treats 0 as if there is no landing pad present, thus we *must* emit
+  // non-zero offsets for all valid LPs.
   //
-  // If the base address can change, we cannot use absolute addresses for
-  // landing pads (at least not without runtime relocations). Hence, we fall
-  // back to emitting landing pads relative to the FDE start.
-  // As we are emitting label differences, we have to guarantee both labels are
-  // defined in the same section and hence cannot place the landing pad into a
-  // cold fragment when the corresponding call site is in the hot fragment.
-  // Because of this issue and the previously described issue of possible
-  // zero-offset landing pad we have to place landing pads in the same section
-  // as the corresponding invokes for shared objects.
+  // As a solution, for fixed-address binaries we set LPStart to 0, and for
+  // position-independent binaries we set LP start to FDE start minus one byte
+  // for FDEs that start with a landing pad.
   std::function<void(const MCSymbol *)> emitLandingPad;
   if (BC.HasFixedLoadAddress) {
     Streamer.emitIntValue(dwarf::DW_EH_PE_udata4, 1); // LPStart format
     Streamer.emitIntValue(0, 4);                      // LPStart
     emitLandingPad = [&](const MCSymbol *LPSymbol) {
-      if (!LPSymbol)
-        Streamer.emitIntValue(0, 4);
-      else
+      if (LPSymbol)
         Streamer.emitSymbolValue(LPSymbol, 4);
+      else
+        Streamer.emitIntValue(0, 4);
     };
   } else {
-    Streamer.emitIntValue(dwarf::DW_EH_PE_omit, 1); // LPStart format
+    const bool NeedsLPAdjustment = !FF.empty() && FF.front()->isLandingPad();
+    if (NeedsLPAdjustment) {
+      // Use relative LPStart format and emit LPStart as [SymbolStart - 1].
+      Streamer.emitIntValue(dwarf::DW_EH_PE_pcrel | dwarf::DW_EH_PE_sdata4, 1);
+      MCSymbol *DotSymbol = BC.Ctx->createTempSymbol("LPBase");
+      Streamer.emitLabel(DotSymbol);
+
+      const MCExpr *LPStartExpr = MCBinaryExpr::createSub(
+          MCSymbolRefExpr::create(StartSymbol, *BC.Ctx),
+          MCSymbolRefExpr::create(DotSymbol, *BC.Ctx), *BC.Ctx);
+      LPStartExpr = MCBinaryExpr::createSub(
+          LPStartExpr, MCConstantExpr::create(1, *BC.Ctx), *BC.Ctx);
+      Streamer.emitValue(LPStartExpr, 4);
+    } else {
+      // DW_EH_PE_omit means FDE start (StartSymbol) will be used as LPStart.
+      Streamer.emitIntValue(dwarf::DW_EH_PE_omit, 1);
+    }
     emitLandingPad = [&](const MCSymbol *LPSymbol) {
-      if (!LPSymbol)
+      if (LPSymbol) {
+        const MCExpr *LPOffsetExpr = MCBinaryExpr::createSub(
+            MCSymbolRefExpr::create(LPSymbol, *BC.Ctx),
+            MCSymbolRefExpr::create(StartSymbol, *BC.Ctx), *BC.Ctx);
+        if (NeedsLPAdjustment)
+          LPOffsetExpr = MCBinaryExpr::createAdd(
+              LPOffsetExpr, MCConstantExpr::create(1, *BC.Ctx), *BC.Ctx);
+        Streamer.emitValue(LPOffsetExpr, 4);
+      } else {
         Streamer.emitIntValue(0, 4);
-      else
-        Streamer.emitAbsoluteSymbolDiff(LPSymbol, StartSymbol, 4);
+      }
     };
   }

dcci

Thanks for working on this Maksim. Can we write a test?

maksfb · 2024-11-18T23:42:51Z

Thanks for working on this Maksim. Can we write a test?

The correctness of this change is tested by several EH runtime tests that we already have. Sadly, I'm not aware of any current LLVM tools that can dump EH tables to test the output ELF. Or maybe you have something else in mind?

dcci

if it's covered already, LG

maksfb added the BOLT label Nov 18, 2024

maksfb requested review from aaupov, rafaelauler, ayermolo and dcci as code owners November 18, 2024 23:29

dcci reviewed Nov 18, 2024

View reviewed changes

dcci approved these changes Nov 19, 2024

View reviewed changes

fixup! [BOLT] Offset LPStart to avoid unnecessary instructions

a9cc2af

maksfb merged commit 066dd91 into llvm:main Nov 20, 2024
5 of 6 checks passed

maksfb deleted the gh-lpstart-adjustment branch March 6, 2025 02:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[BOLT] Offset LPStart to avoid unnecessary instructions #116713

[BOLT] Offset LPStart to avoid unnecessary instructions #116713

Uh oh!

maksfb commented Nov 18, 2024

Uh oh!

llvmbot commented Nov 18, 2024

Uh oh!

dcci left a comment

Uh oh!

maksfb commented Nov 18, 2024

Uh oh!

dcci left a comment

Uh oh!

Uh oh!

Uh oh!

[BOLT] Offset LPStart to avoid unnecessary instructions #116713

[BOLT] Offset LPStart to avoid unnecessary instructions #116713

Uh oh!

Conversation

maksfb commented Nov 18, 2024

Uh oh!

llvmbot commented Nov 18, 2024

Uh oh!

dcci left a comment

Choose a reason for hiding this comment

Uh oh!

maksfb commented Nov 18, 2024

Uh oh!

dcci left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!