Skip to content

[llvm-objcopy] Add change-section-lma *+/-offset #95431

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 8, 2024

Conversation

eleanor-arm
Copy link
Contributor

@eleanor-arm eleanor-arm commented Jun 13, 2024

llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.

llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software.
@llvmbot
Copy link
Member

llvmbot commented Jun 13, 2024

@llvm/pr-subscribers-llvm-binary-utilities

Author: Eleanor Bonnici (eleanor-arm)

Changes

llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software.


Full diff: https://github.com/llvm/llvm-project/pull/95431.diff

7 Files Affected:

  • (modified) llvm/docs/CommandGuide/llvm-objcopy.rst (+4)
  • (modified) llvm/include/llvm/ObjCopy/CommonConfig.h (+3)
  • (modified) llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp (+7)
  • (added) llvm/test/tools/llvm-objcopy/ELF/change-section-lma-multiple-load.test (+52)
  • (added) llvm/test/tools/llvm-objcopy/ELF/change-section-lma-single-load.test (+45)
  • (modified) llvm/tools/llvm-objcopy/ObjcopyOptions.cpp (+35)
  • (modified) llvm/tools/llvm-objcopy/ObjcopyOpts.td (+8)
diff --git a/llvm/docs/CommandGuide/llvm-objcopy.rst b/llvm/docs/CommandGuide/llvm-objcopy.rst
index a62acfc8fdcd8..d1e2ddda6ec36 100644
--- a/llvm/docs/CommandGuide/llvm-objcopy.rst
+++ b/llvm/docs/CommandGuide/llvm-objcopy.rst
@@ -528,6 +528,10 @@ them.
 
  Mark all defined global symbols as weak in the output.
 
+.. option:: --change-section-lma \*{+-}<offset>
+
+ Currently only supports changing LMA for all sections by the same offset.
+
 MACH-O-SPECIFIC OPTIONS
 -----------------------
 
diff --git a/llvm/include/llvm/ObjCopy/CommonConfig.h b/llvm/include/llvm/ObjCopy/CommonConfig.h
index ae08d4032736e..7f9d90d528b3e 100644
--- a/llvm/include/llvm/ObjCopy/CommonConfig.h
+++ b/llvm/include/llvm/ObjCopy/CommonConfig.h
@@ -245,6 +245,9 @@ struct CommonConfig {
   // Symbol info specified by --add-symbol option.
   SmallVector<NewSymbolInfo, 0> SymbolsToAdd;
 
+  // Integer options
+  int64_t ChangeSectionLMAValAll = 0;
+
   // Boolean options
   bool DeterministicArchives = true;
   bool ExtractDWO = false;
diff --git a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
index f343d1447e055..32e6f5f751a03 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
@@ -670,6 +670,13 @@ static Error handleArgs(const CommonConfig &Config, const ELFConfig &ELFConfig,
     }
   }
 
+  if (Config.ChangeSectionLMAValAll != 0) {
+    for (auto &Seg : Obj.segments()) {
+      if (Seg.FileSize > 0)
+        Seg.PAddr += Config.ChangeSectionLMAValAll;
+    }
+  }
+
   if (Config.OnlyKeepDebug)
     for (auto &Sec : Obj.sections())
       if (Sec.Flags & SHF_ALLOC && Sec.Type != SHT_NOTE)
diff --git a/llvm/test/tools/llvm-objcopy/ELF/change-section-lma-multiple-load.test b/llvm/test/tools/llvm-objcopy/ELF/change-section-lma-multiple-load.test
new file mode 100644
index 0000000000000..70aa8ef386ea2
--- /dev/null
+++ b/llvm/test/tools/llvm-objcopy/ELF/change-section-lma-multiple-load.test
@@ -0,0 +1,52 @@
+# RUN: yaml2obj %s -o %t
+# RUN: llvm-objcopy --change-section-lma *+0x20 %t %t2
+# RUN: llvm-readelf -l %t2 | FileCheck %s --check-prefix=CHECK-PLUS
+# RUN: llvm-objcopy --change-section-lma *-0x10 %t2 %t3
+# RUN: llvm-readelf -l %t3 | FileCheck %s --check-prefix=CHECK-MINUS
+# RUN: not llvm-objcopy --change-section-lma .text3=0x5000 %t  2>&1 | FileCheck %s --check-prefix=ERR
+
+!ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_EXEC
+  Machine:         EM_AARCH64
+Sections:
+  - Name:            .text1
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x1000
+    AddressAlign:    0x1000
+    Content:         "DEADBEEF"
+    Size:            0x1000
+  - Name:            .text2
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x2000
+    AddressAlign:    0x1000
+    Content:         "32323232"
+    Size:            0x1000
+  - Name:            .text3
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x3000
+    AddressAlign:    0x1000
+    Content:         "c3c3c3c3"
+    Size:            0x1000
+ProgramHeaders:
+  - Type:     PT_LOAD
+    Flags:    [ PF_R ]
+    VAddr:    0x1000
+    FirstSec: .text1
+    LastSec:  .text2
+# CHECK-PLUS:  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz
+# CHECK-PLUS:  LOAD           0x001000 0x0000000000001000 0x0000000000001020 0x002000 0x002000
+# CHECK-MINUS: LOAD           0x001000 0x0000000000001000 0x0000000000001010 0x002000 0x002000
+  - Type:     PT_LOAD
+    Flags:    [ PF_R ]
+    VAddr:    0x7000
+    FirstSec: .text3
+    LastSec:  .text3
+# CHECK-PLUS:  LOAD           0x003000 0x0000000000007000 0x0000000000007020 0x001000 0x001000
+# CHECK-MINUS: LOAD           0x003000 0x0000000000007000 0x0000000000007010 0x001000 0x001000
+# ERR: error: bad format for --change-section-lma: it is required that all sections are either incremented, or decremented at the same time; use *+val, or *-val
diff --git a/llvm/test/tools/llvm-objcopy/ELF/change-section-lma-single-load.test b/llvm/test/tools/llvm-objcopy/ELF/change-section-lma-single-load.test
new file mode 100644
index 0000000000000..439ea81bf0b0f
--- /dev/null
+++ b/llvm/test/tools/llvm-objcopy/ELF/change-section-lma-single-load.test
@@ -0,0 +1,45 @@
+# RUN: yaml2obj %s -o %t
+# RUN: llvm-objcopy --change-section-lma *+0x20 %t %t2
+# RUN: llvm-readelf -l %t2 | FileCheck %s --check-prefix=CHECK-PLUS
+# RUN: llvm-objcopy --change-section-lma *-0x10 %t2 %t3
+# RUN: llvm-readelf -l %t3 | FileCheck %s --check-prefix=CHECK-MINUS
+# RUN: not llvm-objcopy --change-section-lma .text3=0x5000 %t  2>&1 | FileCheck %s --check-prefix=ERR
+
+!ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_EXEC
+  Machine:         EM_AARCH64
+Sections:
+  - Name:            .text1
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x1000
+    AddressAlign:    0x1000
+    Content:         "DEADBEEF"
+    Size:            0x1000
+  - Name:            .text2
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x2000
+    AddressAlign:    0x1000
+    Content:         "32323232"
+    Size:            0x1000
+  - Name:            .text3
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x3000
+    AddressAlign:    0x1000
+    Content:         "c3c3c3c3"
+    Size:            0x1000
+ProgramHeaders:
+  - Type:     PT_LOAD
+    Flags:    [ PF_R ]
+    VAddr:    0x1000
+    FirstSec: .text1
+    LastSec:  .text3
+# CHECK-PLUS:  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz
+# CHECK-PLUS:  LOAD           0x001000 0x0000000000001000 0x0000000000001020 0x003000 0x003000
+# CHECK-MINUS: LOAD           0x001000 0x0000000000001000 0x0000000000001010 0x003000 0x003000
+# ERR: error: bad format for --change-section-lma: it is required that all sections are either incremented, or decremented at the same time; use *+val, or *-val
diff --git a/llvm/tools/llvm-objcopy/ObjcopyOptions.cpp b/llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
index 4ab3b7265f2f6..c5f3c971a5599 100644
--- a/llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
+++ b/llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
@@ -552,6 +552,33 @@ static Error loadNewSectionData(StringRef ArgValue, StringRef OptionName,
   return Error::success();
 }
 
+static Expected<int64_t> parseAdjustSectionLMA(StringRef ArgValue,
+                                               StringRef OptionName) {
+  StringRef StringValue;
+  if (ArgValue.starts_with("*+")) {
+    StringValue = ArgValue.slice(2, StringRef::npos);
+  } else if (ArgValue.starts_with("*-")) {
+    StringValue = ArgValue.slice(1, StringRef::npos);
+  } else {
+    return createStringError(errc::invalid_argument,
+                             "bad format for " + OptionName +
+                                 ": it is required that all sections "
+                                 "are either incremented, or decremented at "
+                                 "the same time; use *+val, "
+                                 "or *-val");
+  }
+  if (StringValue.empty())
+    return createStringError(errc::invalid_argument,
+                             "bad format for " + OptionName +
+                                 ": missing offset of LMA");
+
+  auto SLMAV = getAsInteger<int64_t>(StringValue);
+  if (!SLMAV)
+    return createStringError(SLMAV.getError(),
+                             "Unable to parse adjustment value");
+  return *SLMAV;
+}
+
 // parseObjcopyOptions returns the config and sets the input arguments. If a
 // help flag is set then parseObjcopyOptions will print the help messege and
 // exit.
@@ -833,6 +860,14 @@ objcopy::parseObjcopyOptions(ArrayRef<const char *> RawArgsArr,
     Config.PadTo = *Addr;
   }
 
+  if (const auto *Arg = InputArgs.getLastArg(OBJCOPY_adjust_section_lma)) {
+    Expected<int64_t> SLMAV =
+        parseAdjustSectionLMA(Arg->getValue(), Arg->getSpelling());
+    if (!SLMAV)
+      return SLMAV.takeError();
+    Config.ChangeSectionLMAValAll = *SLMAV;
+  }
+
   for (auto *Arg : InputArgs.filtered(OBJCOPY_redefine_symbol)) {
     if (!StringRef(Arg->getValue()).contains('='))
       return createStringError(errc::invalid_argument,
diff --git a/llvm/tools/llvm-objcopy/ObjcopyOpts.td b/llvm/tools/llvm-objcopy/ObjcopyOpts.td
index 4bc80eba05f8e..44e9963ec1cb0 100644
--- a/llvm/tools/llvm-objcopy/ObjcopyOpts.td
+++ b/llvm/tools/llvm-objcopy/ObjcopyOpts.td
@@ -254,6 +254,14 @@ def adjust_start : JoinedOrSeparate<["--"], "adjust-start">,
                    Alias<change_start>,
                    HelpText<"Alias for --change-start">;
 
+defm adjust_section_lma
+    : Eq<"change-section-lma",
+         "Change LMA of all sections by <val>. LMA is the physical address where the section is "
+         "loaded in the memory; as opoosed to VMA, which is the virtual address at which the "
+         "section resides during program execution. Ussually LMA and VMA are the same, but some "
+         "platforms support these to be different resulting in runtime copying">,
+      MetaVarName<"*{+|-}val">;
+
 defm add_symbol
     : Eq<"add-symbol", "Add new symbol <name> to .symtab. Accepted flags: "
          "global, local, weak, default, hidden, protected, file, section, object, "

@jh7370 jh7370 requested a review from MaskRay June 14, 2024 08:07
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you describe what the actual behaviour is when GNU objcopy uses this option, please?

Also, could you go into more detail about what your use-case is, please?

@eleanor-arm eleanor-arm force-pushed the objcopy branch 2 times, most recently from fdbd32a to 304a839 Compare June 18, 2024 16:36
@eleanor-arm eleanor-arm requested a review from jh7370 June 20, 2024 09:06
llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
@eleanor-arm eleanor-arm requested a review from jh7370 June 29, 2024 15:43
Data: ELFDATA2LSB
Type: ET_EXEC
Machine: EM_AARCH64
Sections:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, fair enough. I've actually changed my mind about not having any sections anyway, as I feel like it would be good to verify that the section sh_addr value isn't changed, as part of "documenting what the feature does through tests".

llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
@eleanor-arm
Copy link
Contributor Author

it would be good to verify that the section sh_addr value isn't changed, as part of "documenting what the feature does through tests".

I've tried to add this to the test. Is it clear enough what it's showing?

@eleanor-arm eleanor-arm requested a review from jh7370 July 1, 2024 11:01
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly fine, but it looks like my last round of test comments haven't been addressed yet?

Comment on lines +3 to +4
# RUN: llvm-readelf --program-headers %t2 | FileCheck %s --check-prefix=CHECK-PLUS-PROGRAMS
# RUN: llvm-readelf --section-headers %t2 | FileCheck %s --check-prefix=CHECK-PLUS-SECTIONS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could simplify this and the similar below down to a single call with both options specified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the original version had them in one command but I "simplified" it by having two separate commands. I think it makes it clearly to see what is being checked.

llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
@eleanor-arm
Copy link
Contributor Author

I've responded to all the unresolved comments. Hopefully, this should cover it all.

@eleanor-arm eleanor-arm requested a review from jh7370 July 5, 2024 15:10
llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, otherwise LGTM.

llvm-objcopy did not support change-section-lma argument.

This patch adds support for a use case of change-section-lma, that is
shifting load address of all sections by the same offset. This seems to
be the only practical use case of change-section-lma, found in other
software such as Zephyr RTOS's build system.

This is an option that could possibly be supported in some other than
ELF formats, however this change only implements it for ELF. When used
with other formats an error message is raised.

In comparison, the behavior of GNU objcopy is inconsistent. For some ELF
files it behaves the same as described above. For others, it copies the
file without modifying the p_paddr fields when it would be expected. In
some experiments it modifies arbitrary fields in section or program
headers. It is unclear what exactly determines this.
The executable file generated by yaml2obj in this test is not parsable
by GNU objcopy. With Machine set to EM_AARCH64, the file can be parsed
and the first test in the test file completes with 0 exit code. However,
the result is rather arbitrary. AArch64 GNU objcopy subtracts 0x1000
from p_filesz and p_memsz of the first LOAD section and 0x1000 from
p_offset of the second LOAD section. It does not look meaningful.
@eleanor-arm eleanor-arm merged commit 3320036 into llvm:main Jul 8, 2024
8 checks passed
@eleanor-arm eleanor-arm deleted the objcopy branch July 8, 2024 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants