Skip to content

[BOLT] Overwrite .eh_frame and .gcc_except_table #116755

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 19, 2024

Conversation

maksfb
Copy link
Contributor

@maksfb maksfb commented Nov 19, 2024

Under --use-old-text or --strict, we completely rewrite contents of EH frames and exception tables sections. If new contents of either section do not exceed the size of the original section, rewrite the section in-place.

Under --use-old-text or --strict, we completely rewrite contents of EH
frames and exception tables sections. If new contents of either section
do not exceed the size of the original section, rewrite the section
in-place.
@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

Changes

Under --use-old-text or --strict, we completely rewrite contents of EH frames and exception tables sections. If new contents of either section do not exceed the size of the original section, rewrite the section in-place.


Full diff: https://github.com/llvm/llvm-project/pull/116755.diff

4 Files Affected:

  • (modified) bolt/include/bolt/Core/BinarySection.h (+9)
  • (modified) bolt/lib/Core/BinarySection.cpp (+9)
  • (modified) bolt/lib/Rewrite/RewriteInstance.cpp (+44-11)
  • (added) bolt/test/eh-frame-overwrite.test (+8)
diff --git a/bolt/include/bolt/Core/BinarySection.h b/bolt/include/bolt/Core/BinarySection.h
index d362961176b326..1093f6ad78a990 100644
--- a/bolt/include/bolt/Core/BinarySection.h
+++ b/bolt/include/bolt/Core/BinarySection.h
@@ -87,6 +87,7 @@ class BinarySection {
                                    // been renamed)
   uint64_t OutputAddress{0};       // Section address for the rewritten binary.
   uint64_t OutputSize{0};          // Section size in the rewritten binary.
+                                   // Can exceed OutputContents with padding.
   uint64_t OutputFileOffset{0};    // File offset in the rewritten binary file.
   StringRef OutputContents;        // Rewritten section contents.
   const uint64_t SectionNumber;    // Order in which the section was created.
@@ -474,6 +475,11 @@ class BinarySection {
   /// Use name \p SectionName for the section during the emission.
   void emitAsData(MCStreamer &Streamer, const Twine &SectionName) const;
 
+  /// Write finalized contents of the section. If OutputSize exceeds the size of
+  /// the OutputContents, append zero padding to the stream and return the
+  /// number of byte written which should match the OutputSize.
+  uint64_t write(raw_ostream &OS) const;
+
   using SymbolResolverFuncTy = llvm::function_ref<uint64_t(const MCSymbol *)>;
 
   /// Flush all pending relocations to patch original contents of sections
@@ -497,6 +503,9 @@ class BinarySection {
     IsFinalized = true;
   }
 
+  /// When writing section contents, add \p PaddingSize zero bytes at the end.
+  void addPadding(uint64_t PaddingSize) { OutputSize += PaddingSize; }
+
   /// Reorder the contents of this section according to /p Order.  If
   /// /p Inplace is true, the entire contents of the section is reordered,
   /// otherwise the new contents contain only the reordered data.
diff --git a/bolt/lib/Core/BinarySection.cpp b/bolt/lib/Core/BinarySection.cpp
index 9ad49ca1b3a038..b16e0a4333aa2d 100644
--- a/bolt/lib/Core/BinarySection.cpp
+++ b/bolt/lib/Core/BinarySection.cpp
@@ -142,6 +142,15 @@ void BinarySection::emitAsData(MCStreamer &Streamer,
     Streamer.emitLabel(BC.Ctx->getOrCreateSymbol("__hot_data_end"));
 }
 
+uint64_t BinarySection::write(raw_ostream &OS) const {
+  const uint64_t NumValidContentBytes =
+      std::min<uint64_t>(getOutputContents().size(), getOutputSize());
+  OS.write(getOutputContents().data(), NumValidContentBytes);
+  if (getOutputSize() > NumValidContentBytes)
+    OS.write_zeros(getOutputSize() - NumValidContentBytes);
+  return getOutputSize();
+}
+
 void BinarySection::flushPendingRelocations(raw_pwrite_stream &OS,
                                             SymbolResolverFuncTy Resolver) {
   if (PendingRelocations.empty() && Patches.empty())
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 40769944e3876b..cac25815071c59 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -3887,6 +3887,43 @@ void RewriteInstance::mapCodeSections(BOLTLinker::SectionMapper MapSection) {
 
 void RewriteInstance::mapAllocatableSections(
     BOLTLinker::SectionMapper MapSection) {
+
+  if (opts::UseOldText || opts::StrictMode) {
+    auto tryRewriteSection = [&](BinarySection &OldSection,
+                                 BinarySection &NewSection) {
+      if (OldSection.getSize() < NewSection.getOutputSize())
+        return;
+
+      BC->outs() << "BOLT-INFO: rewriting " << OldSection.getName()
+                 << " in-place\n";
+
+      NewSection.setOutputAddress(OldSection.getAddress());
+      NewSection.setOutputFileOffset(OldSection.getInputFileOffset());
+      MapSection(NewSection, OldSection.getAddress());
+
+      // Pad contents with zeros.
+      NewSection.addPadding(OldSection.getSize() - NewSection.getOutputSize());
+
+      // Prevent the original section name from appearing in the section header
+      // table.
+      OldSection.setAnonymous(true);
+    };
+
+    if (EHFrameSection) {
+      BinarySection *NewEHFrameSection =
+          getSection(getNewSecPrefix() + getEHFrameSectionName());
+      assert(NewEHFrameSection && "New contents expected for .eh_frame");
+      tryRewriteSection(*EHFrameSection, *NewEHFrameSection);
+    }
+    BinarySection *EHSection = getSection(".gcc_except_table");
+    BinarySection *NewEHSection =
+        getSection(getNewSecPrefix() + ".gcc_except_table");
+    if (EHSection) {
+      assert(NewEHSection && "New contents expected for .gcc_except_table");
+      tryRewriteSection(*EHSection, *NewEHSection);
+    }
+  }
+
   // Allocate read-only sections first, then writable sections.
   enum : uint8_t { ST_READONLY, ST_READWRITE };
   for (uint8_t SType = ST_READONLY; SType <= ST_READWRITE; ++SType) {
@@ -4164,7 +4201,6 @@ void RewriteInstance::rewriteNoteSections() {
     // New section size.
     uint64_t Size = 0;
     bool DataWritten = false;
-    uint8_t *SectionData = nullptr;
     // Copy over section contents unless it's one of the sections we overwrite.
     if (!willOverwriteSection(SectionName)) {
       Size = Section.sh_size;
@@ -4196,12 +4232,7 @@ void RewriteInstance::rewriteNoteSections() {
     if (BSec->getAllocAddress()) {
       assert(!DataWritten && "Writing section twice.");
       (void)DataWritten;
-      SectionData = BSec->getOutputData();
-
-      LLVM_DEBUG(dbgs() << "BOLT-DEBUG: " << (Size ? "appending" : "writing")
-                        << " contents to section " << SectionName << '\n');
-      OS.write(reinterpret_cast<char *>(SectionData), BSec->getOutputSize());
-      Size += BSec->getOutputSize();
+      Size += BSec->write(OS);
     }
 
     BSec->setOutputFileOffset(NextAvailableOffset);
@@ -4232,8 +4263,7 @@ void RewriteInstance::rewriteNoteSections() {
                << " of size " << Section.getOutputSize() << " at offset 0x"
                << Twine::utohexstr(Section.getOutputFileOffset()) << '\n');
 
-    OS.write(Section.getOutputContents().data(), Section.getOutputSize());
-    NextAvailableOffset += Section.getOutputSize();
+    NextAvailableOffset += Section.write(OS);
   }
 }
 
@@ -4347,6 +4377,9 @@ RewriteInstance::getOutputSections(ELFObjectFile<ELFT> *File,
     BinarySection *BinSec = BC->getSectionForSectionRef(SecRef);
     assert(BinSec && "Matching BinarySection should exist.");
 
+    if (BinSec->isAnonymous())
+      continue;
+
     addSection(Section, *BinSec);
   }
 
@@ -5699,8 +5732,8 @@ void RewriteInstance::rewriteFile() {
                  << Twine::utohexstr(Section.getAllocAddress()) << "\n of size "
                  << Section.getOutputSize() << "\n at offset "
                  << Section.getOutputFileOffset() << '\n';
-    OS.pwrite(reinterpret_cast<const char *>(Section.getOutputData()),
-              Section.getOutputSize(), Section.getOutputFileOffset());
+    OS.seek(Section.getOutputFileOffset());
+    Section.write(OS);
   }
 
   for (BinarySection &Section : BC->allocatableSections())
diff --git a/bolt/test/eh-frame-overwrite.test b/bolt/test/eh-frame-overwrite.test
new file mode 100644
index 00000000000000..649d739ec6086a
--- /dev/null
+++ b/bolt/test/eh-frame-overwrite.test
@@ -0,0 +1,8 @@
+# Check that llvm-bolt can overwrite .eh_frame section in-place.
+
+REQUIRES: system-linux
+
+RUN: %clang %cflags %p/Inputs/hello.c -o %t -Wl,-q
+RUN: llvm-bolt %t -o %t.bolt --strict | FileCheck %s
+
+CHECK: rewriting .eh_frame in-place

Copy link
Member

@dcci dcci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments, otherwise looks reasonable to me.

@@ -4347,6 +4377,9 @@ RewriteInstance::getOutputSections(ELFObjectFile<ELFT> *File,
BinarySection *BinSec = BC->getSectionForSectionRef(SecRef);
assert(BinSec && "Matching BinarySection should exist.");

if (BinSec->isAnonymous())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is the matching bit of

      // Prevent the original section name from appearing in the section header
      // table.

in the rewriter. Can you add a comment? Do we have other anonymous sections outside of this one in the patch? Looks like this is a larger functional change (OK to have it here, but I want to understand)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently use anonymous sections to emit jump tables.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

MapSection(NewSection, OldSection.getAddress());

// Pad contents with zeros.
NewSection.addPadding(OldSection.getSize() - NewSection.getOutputSize());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the padding is required to make sure the size match? We can add that in a comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we overwrite the original contents, if the new section is smaller, there will be leftovers from the original section left in the tail. Hence, we write zeros to remove "junk" bytes.

@maksfb maksfb merged commit 9965532 into llvm:main Nov 19, 2024
4 of 5 checks passed
@maksfb maksfb deleted the gh-rewrite-eh-sections branch March 6, 2025 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants