Skip to content

[llvm-objcopy] Add --gap-fill and --pad-to options #65815

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Dec 14, 2023

Conversation

quic-akaryaki
Copy link
Contributor

--gap-fill <value> fills the gaps between sections with a specified 8-bit value, instead of zero.
--pad-to <address> pads the output binary up to the specified load address, using the 8-bit value from --gap-fill or zero.

These options are only supported for ELF inputs and binary outputs.

`--gap-fill <value>` fills the gaps between sections with a specified
8-bit value, instead of zero.
`--pad-to <address>` pads the output binary up to the specified load
address, using the 8-bit value from `--gap-fill` or zero.

These options are only supported for ELF inputs and binary outputs.
@quic-akaryaki
Copy link
Contributor Author

This is an implementation of section padding with a more limited scope than was attempted in https://reviews.llvm.org/D67689.
Only binary outputs are supported, but ELF and IHEX are not. The main motivation for taking this approach was to support the clear use case of building padded ROM images without figuring all the murky details of the general ELF case.
After reviewing https://reviews.llvm.org/D67689, it seems the main problem was with how the section content is updated for padding. The SectionBase::OriginalData was modified, which is incorrect. However, I think the bigger problem is that there are multiple section types that represent their data differently (SectionBase::OriginalData, OwnedDataSection::Data, compressed sections, etc). In binutils, sections are represented uniformly and resizing them in one location is sufficient. I don't know if a similar simple solution is possible in llvm-objdump. Therefore, I think it would make sense to handle padding in the output stage.
I think which approach is better going forward ultimately depends on whether section padding is a valuable use case for ELF and IHEX outputs. I am not aware of such use cases, but suggestions are welcome. Talking about ROM images, it is possible in some platforms to directly flash an ihex. However, since ihex contains addresses, it supports gaps naturally and does not have to store all the padding bytes. During the flashing, the gaps are skipped.

@MaskRay
Copy link
Member

MaskRay commented Sep 8, 2023

(It seems that pr-subscribers-llvm-binary-utilities cannot be added as a reviewer)

@quic-akaryaki
Copy link
Contributor Author

I'm not familiar with the new github PR flow. I cannot add reviewers myself and there is no automatic assignment/recommendation.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your test cases don't appear to involve any program headers. It's my understanding that -O binary is usually related to loadable programs, which have program headers. Indeed, at least one of your test cases talks about segments, but doesn't use them at all! Does your test case need updating?

You should probably update ConfigManager.cpp to indicate that the option isn't supported on other platforms.

Change-Id: I7d57e3e87bd0393bbfe1b664228fe3bb0154d244
@llvmbot
Copy link
Member

llvmbot commented Sep 14, 2023

@llvm/pr-subscribers-llvm-binary-utilities

Changes `--gap-fill ` fills the gaps between sections with a specified 8-bit value, instead of zero. `--pad-to ` pads the output binary up to the specified load address, using the 8-bit value from `--gap-fill` or zero.

These options are only supported for ELF inputs and binary outputs.

Patch is 24.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65815.diff

12 Files Affected:

  • (modified) llvm/docs/CommandGuide/llvm-objcopy.rst (+10)
  • (modified) llvm/docs/ReleaseNotes.rst (+3)
  • (modified) llvm/include/llvm/ObjCopy/CommonConfig.h (+2)
  • (modified) llvm/lib/ObjCopy/ConfigManager.cpp (+8-4)
  • (modified) llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp (+1-1)
  • (modified) llvm/lib/ObjCopy/ELF/ELFObject.cpp (+29-2)
  • (modified) llvm/lib/ObjCopy/ELF/ELFObject.h (+4-1)
  • (added) llvm/test/tools/llvm-objcopy/ELF/gap-fill-order.test (+36)
  • (added) llvm/test/tools/llvm-objcopy/ELF/gap-fill.test (+155)
  • (added) llvm/test/tools/llvm-objcopy/ELF/pad-to.test (+94)
  • (modified) llvm/tools/llvm-objcopy/ObjcopyOptions.cpp (+31)
  • (modified) llvm/tools/llvm-objcopy/ObjcopyOpts.td (+12)
diff --git a/llvm/docs/CommandGuide/llvm-objcopy.rst b/llvm/docs/CommandGuide/llvm-objcopy.rst
index 0233a4f7b2f045f..6e13cd94b92fdaf 100644
--- a/llvm/docs/CommandGuide/llvm-objcopy.rst
+++ b/llvm/docs/CommandGuide/llvm-objcopy.rst
@@ -324,6 +324,11 @@ them.
 
  Extract the named partition from the output.
 
+.. option:: --gap-fill &lt;value&gt;
+
+ For binary outputs, fill the gaps between sections with ``&lt;value&gt;`` instead
+ of zero. The value must be an unsigned 8-bit integer.
+
 .. option:: --globalize-symbol &lt;symbol&gt;
 
  Mark any defined symbols named ``&lt;symbol&gt;`` as global symbols in the output.
@@ -411,6 +416,11 @@ them.
  be the same as the value specified for :option:`--input-target` or the input
  file&#x27;s format if that option is also unspecified.
 
+.. option:: --pad-to &lt;address&gt;
+
+ For binary outputs, pad the output to the load address ``&lt;address&gt;`` using a value
+ of zero or the value specified by :option:`--gap-fill`.
+
 .. option:: --prefix-alloc-sections &lt;prefix&gt;
 
  Add ``&lt;prefix&gt;`` to the front of the names of all allocatable sections in the
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index 226ee606d17ee22..bad5dbc8a102473 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -167,6 +167,9 @@ Changes to the LLVM tools
 * llvm-symbolizer now treats invalid input as an address for which source
   information is not found.
 
+* llvm-objcopy now supports ``--gap-fill`` and ``--pad-to`` options, for
+  ELF input and binary output files only.
+
 Changes to LLDB
 ---------------------------------
 
diff --git a/llvm/include/llvm/ObjCopy/CommonConfig.h b/llvm/include/llvm/ObjCopy/CommonConfig.h
index e7ce1e6f2c54d75..386c20aec184ded 100644
--- a/llvm/include/llvm/ObjCopy/CommonConfig.h
+++ b/llvm/include/llvm/ObjCopy/CommonConfig.h
@@ -214,6 +214,8 @@ struct CommonConfig {
   // Cached gnu_debuglink&#x27;s target CRC
   uint32_t GnuDebugLinkCRC32;
   std::optional&lt;StringRef&gt; ExtractPartition;
+  uint8_t GapFill = 0;
+  uint64_t PadTo = 0;
   StringRef SplitDWO;
   StringRef SymbolsPrefix;
   StringRef AllocSectionsPrefix;
diff --git a/llvm/lib/ObjCopy/ConfigManager.cpp b/llvm/lib/ObjCopy/ConfigManager.cpp
index 5b8e2f5dc2003af..6074723279e8150 100644
--- a/llvm/lib/ObjCopy/ConfigManager.cpp
+++ b/llvm/lib/ObjCopy/ConfigManager.cpp
@@ -23,7 +23,8 @@ Expected&lt;const COFFConfig &amp;&gt; ConfigManager::getCOFFConfig() const {
       Common.ExtractDWO || Common.PreserveDates || Common.StripDWO ||
       Common.StripNonAlloc || Common.StripSections || Common.Weaken ||
       Common.DecompressDebugSections ||
-      Common.DiscardMode == DiscardType::Locals || !Common.SymbolsToAdd.empty())
+      Common.DiscardMode == DiscardType::Locals ||
+      !Common.SymbolsToAdd.empty() || Common.GapFill != 0 || Common.PadTo != 0)
     return createStringError(llvm::errc::invalid_argument,
                              &quot;option is not supported for COFF&quot;);
 
@@ -42,7 +43,8 @@ Expected&lt;const MachOConfig &amp;&gt; ConfigManager::getMachOConfig() const {
       Common.PreserveDates || Common.StripAllGNU || Common.StripDWO ||
       Common.StripNonAlloc || Common.StripSections || Common.Weaken ||
       Common.DecompressDebugSections || Common.StripUnneeded ||
-      Common.DiscardMode == DiscardType::Locals || !Common.SymbolsToAdd.empty())
+      Common.DiscardMode == DiscardType::Locals ||
+      !Common.SymbolsToAdd.empty() || Common.GapFill != 0 || Common.PadTo != 0)
     return createStringError(llvm::errc::invalid_argument,
                              &quot;option is not supported for MachO&quot;);
 
@@ -60,7 +62,8 @@ Expected&lt;const WasmConfig &amp;&gt; ConfigManager::getWasmConfig() const {
       !Common.SymbolsToWeaken.empty() || !Common.SymbolsToKeepGlobal.empty() ||
       !Common.SectionsToRename.empty() || !Common.SetSectionAlignment.empty() ||
       !Common.SetSectionFlags.empty() || !Common.SetSectionType.empty() ||
-      !Common.SymbolsToRename.empty())
+      !Common.SymbolsToRename.empty() || Common.GapFill != 0 ||
+      Common.PadTo != 0)
     return createStringError(llvm::errc::invalid_argument,
                              &quot;only flags for section dumping, removal, and &quot;
                              &quot;addition are supported&quot;);
@@ -86,7 +89,8 @@ Expected&lt;const XCOFFConfig &amp;&gt; ConfigManager::getXCOFFConfig() const {
       Common.ExtractMainPartition || Common.OnlyKeepDebug ||
       Common.PreserveDates || Common.StripAllGNU || Common.StripDWO ||
       Common.StripDebug || Common.StripNonAlloc || Common.StripSections ||
-      Common.Weaken || Common.StripUnneeded || Common.DecompressDebugSections) {
+      Common.Weaken || Common.StripUnneeded || Common.DecompressDebugSections ||
+      Common.GapFill != 0 || Common.PadTo != 0) {
     return createStringError(
         llvm::errc::invalid_argument,
         &quot;no flags are supported yet, only basic copying is allowed&quot;);
diff --git a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
index 58cb9cb65679b71..37d2af2139e317f 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
@@ -180,7 +180,7 @@ static std::unique_ptr&lt;Writer&gt; createWriter(const CommonConfig &amp;Config,
                                             ElfType OutputElfType) {
   switch (Config.OutputFormat) {
   case FileFormat::Binary:
-    return std::make_unique&lt;BinaryWriter&gt;(Obj, Out);
+    return std::make_unique&lt;BinaryWriter&gt;(Obj, Out, Config);
   case FileFormat::IHex:
     return std::make_unique&lt;IHexWriter&gt;(Obj, Out);
   default:
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.cpp b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
index 697afab2a617b47..85a123d26ebfe42 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
@@ -2635,9 +2635,36 @@ template &lt;class ELFT&gt; Error ELFWriter&lt;ELFT&gt;::finalize() {
 }
 
 Error BinaryWriter::write() {
-  for (const SectionBase &amp;Sec : Obj.allocSections())
+  SmallVector&lt;const SectionBase *, 30&gt; LoadableSections;
+  for (const SectionBase &amp;Sec : Obj.allocSections()) {
+    if (Sec.Type != SHT_NOBITS)
+      LoadableSections.push_back(&amp;Sec);
+  }
+
+  if (LoadableSections.empty())
+    return Error::success();
+
+  llvm::stable_sort(LoadableSections,
+                    [](const SectionBase *LHS, const SectionBase *RHS) {
+                      return LHS-&gt;Offset &lt; RHS-&gt;Offset;
+                    });
+
+  assert(LoadableSections.front()-&gt;Offset == 0);
+
+  for (std::size_t i = 0; i != LoadableSections.size(); ++i) {
+    const SectionBase &amp;Sec = *LoadableSections[i];
     if (Error Err = Sec.accept(*SecWriter))
       return Err;
+    if (GapFill == 0)
+      continue;
+    uint64_t PadOffset = (i &lt; LoadableSections.size() - 1)
+                             ? LoadableSections[i + 1]-&gt;Offset
+                             : Buf-&gt;getBufferSize();
+    assert(PadOffset &lt;= Buf-&gt;getBufferSize());
+    assert(Sec.Offset + Sec.Size &lt;= PadOffset);
+    std::fill(Buf-&gt;getBufferStart() + Sec.Offset + Sec.Size,
+              Buf-&gt;getBufferStart() + PadOffset, GapFill);
+  }
 
   // TODO: Implement direct writing to the output stream (without intermediate
   // memory buffer Buf).
@@ -2663,7 +2690,7 @@ Error BinaryWriter::finalize() {
   // file size. This might not be the same as the offset returned by
   // layoutSections, because we want to truncate the last segment to the end of
   // its last non-empty section, to match GNU objcopy&#x27;s behaviour.
-  TotalSize = 0;
+  TotalSize = PadTo.value() &gt; MinAddr ? PadTo.value() - MinAddr : 0;
   for (SectionBase &amp;Sec : Obj.allocSections())
     if (Sec.Type != SHT_NOBITS &amp;&amp; Sec.Size &gt; 0) {
       Sec.Offset = Sec.Addr - MinAddr;
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.h b/llvm/lib/ObjCopy/ELF/ELFObject.h
index 89a03b3fe0ee354..bf6fa32908e3dcf 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.h
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.h
@@ -357,6 +357,8 @@ template &lt;class ELFT&gt; class ELFWriter : public Writer {
 
 class BinaryWriter : public Writer {
 private:
+  const uint8_t GapFill;
+  const std::optional&lt;uint64_t&gt; PadTo;
   std::unique_ptr&lt;BinarySectionWriter&gt; SecWriter;
 
   uint64_t TotalSize = 0;
@@ -365,7 +367,8 @@ class BinaryWriter : public Writer {
   ~BinaryWriter() {}
   Error finalize() override;
   Error write() override;
-  BinaryWriter(Object &amp;Obj, raw_ostream &amp;Out) : Writer(Obj, Out) {}
+  BinaryWriter(Object &amp;Obj, raw_ostream &amp;Out, const CommonConfig &amp;Config)
+      : Writer(Obj, Out), GapFill(Config.GapFill), PadTo(Config.PadTo) {}
 };
 
 class IHexWriter : public Writer {
diff --git a/llvm/test/tools/llvm-objcopy/ELF/gap-fill-order.test b/llvm/test/tools/llvm-objcopy/ELF/gap-fill-order.test
new file mode 100644
index 000000000000000..c154617bd3cd317
--- /dev/null
+++ b/llvm/test/tools/llvm-objcopy/ELF/gap-fill-order.test
@@ -0,0 +1,36 @@
+# RUN: yaml2obj %s -o %t
+
+## In this test, output sections are defined out of
+## order in respect to their load addresses. Verify
+## that gaps are still correctly filled.
+
+# RUN: llvm-objcopy -O binary --gap-fill=0xe9 %t %t-filled
+# RUN: od -v -Ax -t x1 %t-filled | FileCheck --match-full-lines %s
+# CHECK: 000000 aa bb cc dd e9 e9 e9 e9 11 22 33 44
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_EXEC
+  Machine:         EM_X86_64
+Sections:
+  - Name:            .bss
+    Type:            SHT_NOBITS
+    Flags:           [ SHF_ALLOC, SHF_WRITE ]
+    Address:         0x0104
+    AddressAlign:    0x0001
+    Size:            4
+  - Name:            .section1
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_WRITE ]
+    Address:         0x0108
+    AddressAlign:    0x0001
+    Content:         &#x27;11223344&#x27;
+  - Name:            .section3
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_WRITE ]
+    Address:         0x0100
+    AddressAlign:    0x0001
+    Content:         &#x27;AABBCCDD&#x27;
+...
diff --git a/llvm/test/tools/llvm-objcopy/ELF/gap-fill.test b/llvm/test/tools/llvm-objcopy/ELF/gap-fill.test
new file mode 100644
index 000000000000000..5778a27d4bcaf0c
--- /dev/null
+++ b/llvm/test/tools/llvm-objcopy/ELF/gap-fill.test
@@ -0,0 +1,155 @@
+# RUN: yaml2obj %s -o %t
+
+## This test is partially based on one from D67689.
+
+# RUN: not llvm-objcopy --gap-fill 1 %t 2&gt;&amp;1 | FileCheck %s --check-prefix=NOT-BINARY
+# NOT-BINARY: error: &#x27;--gap-fill&#x27; is only supported for binary output
+
+# RUN: not llvm-objcopy -O binary --gap-fill %t 2&gt;&amp;1 | FileCheck %s --check-prefix=EMPTY
+# EMPTY: error: no input file specified
+
+# RUN: not llvm-objcopy -O binary --gap-fill= %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-FORMAT
+# BAD-FORMAT: error: --gap-fill: bad number:
+
+# RUN: not llvm-objcopy -O binary --gap-fill=x %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-INPUT
+# BAD-INPUT: error: --gap-fill: bad number: x
+
+# RUN: not llvm-objcopy -O binary --gap-fill=0x1G %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-INPUT2
+# BAD-INPUT2: error: --gap-fill: bad number: 0x1G
+
+# RUN: not llvm-objcopy -O binary --gap-fill=ff %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-INPUT3
+# BAD-INPUT3: error: --gap-fill: bad number: ff
+
+# RUN: llvm-objcopy -O binary --gap-fill=0x1122 %t %t-val16 2&gt;&amp;1 | FileCheck %s --check-prefix=TRUNCATED
+# TRUNCATED: warning: truncating gap-fill from 0x1122 to 0x22
+
+## Test no gap fill with all allocatable output sections.
+# RUN: llvm-objcopy -O binary %t %t-default
+# RUN: od -v -Ax -t x1 %t-default | FileCheck %s --check-prefix=DEFAULT --match-full-lines
+# DEFAULT:      000000 ee ff 11 22 33 44 aa bb cc dd fe dc ba 00 a1 b2
+# DEFAULT-NEXT: 000010 c3 d4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+# DEFAULT-NEXT: 000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+# DEFAULT-NEXT: 000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+# DEFAULT-NEXT: 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+# DEFAULT-NEXT: 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+# DEFAULT-NEXT: 000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+# DEFAULT-NEXT: 000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+# DEFAULT-NEXT: 000080 00 00 89 ab cd ef
+# DEFAULT-NEXT: 000086
+
+## Test gap fill with all allocatable output sections
+# RUN: llvm-objcopy -O binary --gap-fill=0xe9 %t %t-filled
+# RUN: od -v -Ax -t x1 %t-filled | FileCheck %s --check-prefix=FULL --match-full-lines
+# FULL:      000000 ee ff 11 22 33 44 aa bb cc dd fe dc ba e9 a1 b2
+# FULL-NEXT: 000010 c3 d4 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# FULL-NEXT: 000020 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# FULL-NEXT: 000030 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# FULL-NEXT: 000040 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# FULL-NEXT: 000050 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# FULL-NEXT: 000060 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# FULL-NEXT: 000070 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# FULL-NEXT: 000080 e9 e9 89 ab cd ef
+# FULL-NEXT: 000086
+
+## Test gap fill with a decimal value.
+# RUN: llvm-objcopy -O binary --gap-fill=99 %t %t-filled-decimal
+# RUN: od -v -Ax -t x1 %t-filled-decimal | FileCheck %s --check-prefix=DEC --match-full-lines
+# DEC:      000000 ee ff 11 22 33 44 aa bb cc dd fe dc ba 63 a1 b2
+# DEC-NEXT: 000010 c3 d4 63 63 63 63 63 63 63 63 63 63 63 63 63 63
+# DEC-NEXT: 000020 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63
+# DEC-NEXT: 000030 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63
+# DEC-NEXT: 000040 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63
+# DEC-NEXT: 000050 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63
+# DEC-NEXT: 000060 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63
+# DEC-NEXT: 000070 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63 63
+# DEC-NEXT: 000080 63 63 89 ab cd ef
+# DEC-NEXT: 000086
+
+## Test gap fill with the last section removed, should be truncated.
+# RUN: llvm-objcopy -O binary --gap-fill=0xe9 --remove-section=.foo %t %t-filled
+# RUN: od -v -Ax -t x1 %t-filled | FileCheck %s --check-prefix=REMOVE-LAST-SECTION --match-full-lines
+# REMOVE-LAST-SECTION: 000000 ee ff 11 22 33 44 aa bb cc dd fe dc ba e9 a1 b2
+# REMOVE-LAST-SECTION-NEXT: 000010 c3 d4
+# REMOVE-LAST-SECTION-NEXT: 000012
+
+## Test gap fill with the middle section removed, should be filled.
+# RUN: llvm-objcopy -O binary --gap-fill=0xe9 --remove-section=.gap2 %t %t-filled
+# RUN: od -v -Ax -t x1 %t-filled | FileCheck %s --check-prefix=REMOVE-MIDDLE-SECTION --match-full-lines
+# REMOVE-MIDDLE-SECTION:      000000 ee ff 11 22 33 44 aa bb cc dd fe dc ba e9 e9 e9
+# REMOVE-MIDDLE-SECTION-NEXT: 000010 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# REMOVE-MIDDLE-SECTION-NEXT: 000020 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# REMOVE-MIDDLE-SECTION-NEXT: 000030 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# REMOVE-MIDDLE-SECTION-NEXT: 000040 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# REMOVE-MIDDLE-SECTION-NEXT: 000050 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# REMOVE-MIDDLE-SECTION-NEXT: 000060 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# REMOVE-MIDDLE-SECTION-NEXT: 000070 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9 e9
+# REMOVE-MIDDLE-SECTION-NEXT: 000080 e9 e9 89 ab cd ef
+# REMOVE-MIDDLE-SECTION-NEXT: 000086
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_EXEC
+  Machine:         EM_X86_64
+Sections:
+  - Name:            .space1
+    Type:            Fill
+    Pattern:         &#x27;ABCD&#x27;
+    Size:            0x2
+  - Name:            .nogap
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x0102
+    AddressAlign:    0x0001
+    Size:            0x6
+    Content:         &#x27;EEFF11223344&#x27;
+  - Name:            .gap1
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x0108
+    AddressAlign:    0x0001
+    Content:         &#x27;AABBCCDDFEDCBA&#x27;
+  - Name:            .space2
+    Type:            Fill
+    Pattern:         &#x27;DC&#x27;
+    Size:            1
+  - Name:            .gap2
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x0110
+    AddressAlign:    0x0001
+    Content:         &#x27;A1B2C3D4&#x27;
+  - Name:            .space3
+    Type:            Fill
+    Pattern:         &#x27;FE&#x27;
+    Size:            0x1
+  - Name:            .nobit_tbss
+    Type:            SHT_NOBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC, SHF_TLS ]
+    Address:         0x0180
+    AddressAlign:    0x0008
+    Size:            0x0018
+  - Name:            .space4
+    Type:            Fill
+    Pattern:         &#x27;01234567&#x27;
+    Size:            0x4
+  - Name:            .foo
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x0184
+    AddressAlign:    0x0001
+    Content:         &#x27;89ABCDEF&#x27;
+  - Name:            .nobit_bss
+    Type:            SHT_NOBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x018A
+    AddressAlign:    0x0001
+    Size:            0x0008
+  - Name:            .comment
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_MERGE, SHF_STRINGS ]
+    AddressAlign:    0x0001
+    EntSize:         0x0001
+    Content:         4743433A
+...
diff --git a/llvm/test/tools/llvm-objcopy/ELF/pad-to.test b/llvm/test/tools/llvm-objcopy/ELF/pad-to.test
new file mode 100644
index 000000000000000..99f28b5d4b6ef70
--- /dev/null
+++ b/llvm/test/tools/llvm-objcopy/ELF/pad-to.test
@@ -0,0 +1,94 @@
+# RUN: yaml2obj %s -o %t
+
+# RUN: not llvm-objcopy --pad-to=1 %t 2&gt;&amp;1 | FileCheck %s --check-prefix=NOT-BINARY
+# NOT-BINARY: error: &#x27;--pad-to&#x27; is only supported for binary output
+
+# RUN: not llvm-objcopy -O binary --pad-to= %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-FORMAT
+# BAD-FORMAT: error: --pad-to: bad number:
+
+# RUN: not llvm-objcopy -O binary --pad-to=x %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-INPUT
+# BAD-INPUT: error: --pad-to: bad number: x
+
+# RUN: not llvm-objcopy -O binary --pad-to=0x1G %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-INPUT2
+# BAD-INPUT2: error: --pad-to: bad number: 0x1G
+
+# RUN: not llvm-objcopy -O binary --pad-to=ff %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-INPUT3
+# BAD-INPUT3: error: --pad-to: bad number: ff
+
+# RUN: not llvm-objcopy -O binary --pad-to=0x112233445566778899 %t 2&gt;&amp;1 | FileCheck %s --check-prefix=BAD-NUMBER
+# BAD-NUMBER: error: --pad-to: bad number: 0x112233445566778899
+
+## Save the baseline, not padded output.
+# RUN: llvm-objcopy -O binary %t %t.bin
+
+## Pad to a address smaller than the binary size.
+# RUN: llvm-objcopy -O binary --pad-to=0x20 %t %t-p1
+# RUN: cmp %t.bi...

@quic-akaryaki
Copy link
Contributor Author

Your test cases don't appear to involve any program headers. It's my understanding that -O binary is usually related to loadable programs, which have program headers. Indeed, at least one of your test cases talks about segments, but doesn't use them at all! Does your test case need updating?

You should probably update ConfigManager.cpp to indicate that the option isn't supported on other platforms.

objcopy does not use program headers for binary output. It only uses load addresses from section headers. I think adding them to the test would be redundant.
I have updated ConfigManager.cpp to add the new options to the list of unsupported options.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll probably have more comments at a later date, but I'll let you update this PR to address my comments before making them.

quic-akaryaki and others added 2 commits October 4, 2023 08:53
@quic-akaryaki quic-akaryaki requested a review from jh7370 October 10, 2023 15:42
Use reportWarning(), some renaming and rewording and other miscellaneous
changes.
@quic-akaryaki
Copy link
Contributor Author

quic-akaryaki commented Oct 30, 2023

I created another PR related to missing argument values: #70710.
It is a problem in objcopy as the LLVM args parser detects a missing value, but objcopy (and many other tools) ignores it.
That will be a breaking change.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically looks good. Just some minor nits now.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like @MaskRay to give this a once-over before merging this, but otherwise LGTM.

@quic-akaryaki
Copy link
Contributor Author

@MaskRay please share your opinion. Thank you.

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay..

Type: SHT_PROGBITS
Flags: [ SHF_ALLOC ]
Address: 0x0102
AddressAlign: 0x0001
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can omit AddressAlign fields which are unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not done. There are still many AddressAlign: 0x0001

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done now.

## In this test, output sections are defined out of order with respect to their
## load addresses. Verify that gaps are still correctly filled.

# RUN: yaml2obj --docnum=2 %s -o %t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a different output file name than yaml2obj --docnum=1 %s -o %t

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

A->getValue());
uint8_t ByteVal = Val.get();
if (ByteVal != Val.get())
if (Error E = reportWarning(llvm::createStringError(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that objcopy uses a warning, but an error seems more appropriate and the difference between warning/error should not matter for the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I left the changes to reportWarning as they may be useful in the future (and its less work not to undo them).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jh7370: I think you should undo the reportWarning changes: there's no guarantee anybody will need it in the future, so you shouldn't change it now.

Agreed. We should remove the reportWarning definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

defm gap_fill
: Eq<"gap-fill", "Fill the gaps between sections with <value> instead of zero. "
"<value> must be an unsigned 8-bit integer. "
"This option is only supported for ELF inputs and binary outputs.">,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help message typically omit the trailing period. I think you can use the singular form ... binary output">

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

defm pad_to
: Eq<"pad-to", "Pad the output to the load address <address>, using a value "
"of zero or the value specified by :option:`--gap-fill`"
"This option is only supported for ELF inputs and binary outputs.">,
Copy link
Member

@MaskRay MaskRay Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help message typically omit the trailing period

objcopy help message says "pad the output file up to the load address".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I also removed formatting, not sure why I added it here.

defm pad_to
: Eq<"pad-to", "Pad the output to the load address <address>, using a value "
"of zero or the value specified by :option:`--gap-fill`"
"This option is only supported for ELF inputs and binary outputs.">,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing . before This option ...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

quic-akaryaki and others added 2 commits December 11, 2023 10:27
Change-Id: I3c2b19888c9e4d397a191861136c48b5715155c6
Copy link

github-actions bot commented Dec 11, 2023

:white_check_mark: With the latest revision this PR passed the C/C++ code formatter.

Change-Id: I44945026180d959b9616d70b910321cfc8476eff
Change-Id: Icd803193dd02f63959b56a6f9e404c4751cc933b
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should undo the reportWarning changes: there's no guarantee anybody will need it in the future, so you shouldn't change it now.

@@ -234,11 +234,11 @@ defm update_section
defm gap_fill
: Eq<"gap-fill", "Fill the gaps between sections with <value> instead of zero. "
"<value> must be an unsigned 8-bit integer. "
"This option is only supported for ELF inputs and binary outputs.">,
"This option is only supported for the ELF input and binary output">,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you want "the" before "ELF".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

"This option is only supported for ELF inputs and binary outputs.">,
: Eq<"pad-to", "Pad the output up to the load address <address>, using a value "
"of zero or the value specified by the --gap-fill option. "
"This option is only supported for the ELF input and binary output">,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto: no "the" before "ELF".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@quic-akaryaki quic-akaryaki merged commit 4070dff into llvm:main Dec 14, 2023
@jansvoboda11
Copy link
Contributor

The new tests started failing on Apple build bot: https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/38908/
Can you please fix these or revert the commit?

@quic-akaryaki
Copy link
Contributor Author

Thanks for letting me know, I'm preparing a fix. By the way, is there a way to run a buildbot manually?

@jansvoboda11
Copy link
Contributor

Thanks for letting me know, I'm preparing a fix. By the way, is there a way to run a buildbot manually?

Thanks! Not that I know of.

quic-akaryaki added a commit to quic-akaryaki/llvm-project that referenced this pull request Dec 15, 2023
The tests added in PR llvm#65815 fail on Apple buildbot because the `od`
printed addresses have a different number of leading zeroes. Mask
leading zeroes with a regex.
@quic-akaryaki
Copy link
Contributor Author

Trying a fix in #75631.

quic-akaryaki added a commit that referenced this pull request Dec 15, 2023
The tests added in PR #65815 fail on Apple buildbot because the `od`
printed addresses have a different number of leading zeroes. Mask
leading zeroes with a regex.
To support the `od` output format on z/OS, add `--ignore-case` to FileCheck.
@quic-akaryaki quic-akaryaki deleted the gap-fill-pad-to-binary branch April 2, 2024 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants