Skip to content

Reland "[ObjectYAML][ELF] Take alignment into account when generating notes" #118434

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Dec 5, 2024

Conversation

igorkudrin
Copy link
Collaborator

This relands #118157 with a fix for the use of an uninitialized variable.

The System V ABI states that the note entries and their descriptor fields must be aligned to 4 or 8 bytes for 32-bit or 64-bit objects respectively. In practice, 64-bit systems can use both alignments, with the actual format being determined by the alignment of the segment. For example, the Linux gABI extension contains a special note on this, see 2.1.7 "Alignment of Note Sections".

This patch adjusts the format of the generated notes to the specified section alignment. Since llvm-readobj was fixed in a similar way in D150022, "[Object] Fix handling of Elf_Nhdr with sh_addralign=8", the generated notes can now be parsed successfully by the tool.

… notes"

This relands llvm#118157 with a fix for the use of an uninitialised variable.

The [System V ABI](https://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section)
states that the note entries and their descriptor fields must be aligned
to 4 or 8 bytes for 32-bit or 64-bit objects respectively. In practice,
64-bit systems can use both alignments, with the actual format being
determined by the alignment of the segment. For example, the
[Linux gABI extension](https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf)
contains a special note on this, see 2.1.7 "Alignment of Note Sections".

This patch adjusts the format of the generated notes to the specified
section alignment. Since `llvm-readobj` was fixed in a similar way in
[D150022](https://reviews.llvm.org/D150022), "[Object] Fix handling of
Elf_Nhdr with sh_addralign=8", the generated notes can now be parsed
successfully by the tool.
@llvmbot
Copy link
Member

llvmbot commented Dec 3, 2024

@llvm/pr-subscribers-objectyaml

Author: Igor Kudrin (igorkudrin)

Changes

This relands #118157 with a fix for the use of an uninitialized variable.

The System V ABI states that the note entries and their descriptor fields must be aligned to 4 or 8 bytes for 32-bit or 64-bit objects respectively. In practice, 64-bit systems can use both alignments, with the actual format being determined by the alignment of the segment. For example, the Linux gABI extension contains a special note on this, see 2.1.7 "Alignment of Note Sections".

This patch adjusts the format of the generated notes to the specified section alignment. Since llvm-readobj was fixed in a similar way in D150022, "[Object] Fix handling of Elf_Nhdr with sh_addralign=8", the generated notes can now be parsed successfully by the tool.


Full diff: https://github.com/llvm/llvm-project/pull/118434.diff

2 Files Affected:

  • (modified) llvm/lib/ObjectYAML/ELFEmitter.cpp (+18-2)
  • (modified) llvm/test/tools/yaml2obj/ELF/note-section.yaml (+129)
diff --git a/llvm/lib/ObjectYAML/ELFEmitter.cpp b/llvm/lib/ObjectYAML/ELFEmitter.cpp
index 476334024151a9..5833d100656ede 100644
--- a/llvm/lib/ObjectYAML/ELFEmitter.cpp
+++ b/llvm/lib/ObjectYAML/ELFEmitter.cpp
@@ -1799,6 +1799,21 @@ void ELFState<ELFT>::writeSectionContent(Elf_Shdr &SHeader,
   if (!Section.Notes)
     return;
 
+  unsigned Align;
+  switch (SHeader.sh_addralign) {
+  case 0:
+  case 4:
+    Align = 4;
+    break;
+  case 8:
+    Align = 8;
+    break;
+  default:
+    reportError(Section.Name + ": invalid alignment for a note section: 0x" +
+                Twine::utohexstr(SHeader.sh_addralign));
+    return;
+  }
+
   uint64_t Offset = CBA.tell();
   for (const ELFYAML::NoteEntry &NE : *Section.Notes) {
     // Write name size.
@@ -1820,14 +1835,15 @@ void ELFState<ELFT>::writeSectionContent(Elf_Shdr &SHeader,
     if (!NE.Name.empty()) {
       CBA.write(NE.Name.data(), NE.Name.size());
       CBA.write('\0');
-      CBA.padToAlignment(4);
     }
 
     // Write description and padding.
     if (NE.Desc.binary_size() != 0) {
+      CBA.padToAlignment(Align);
       CBA.writeAsBinary(NE.Desc);
-      CBA.padToAlignment(4);
     }
+
+    CBA.padToAlignment(Align);
   }
 
   SHeader.sh_size = CBA.tell() - Offset;
diff --git a/llvm/test/tools/yaml2obj/ELF/note-section.yaml b/llvm/test/tools/yaml2obj/ELF/note-section.yaml
index 80359c4ec01833..26b95e1c2379b2 100644
--- a/llvm/test/tools/yaml2obj/ELF/note-section.yaml
+++ b/llvm/test/tools/yaml2obj/ELF/note-section.yaml
@@ -333,3 +333,132 @@ Sections:
       - Name: ABC
         Desc: '123456'
         Type: NT_VERSION
+
+## Check that an incorrect alignment is reported
+
+# RUN: not yaml2obj --docnum=16 %s 2>&1 | FileCheck %s --check-prefix=ERR_ALIGN1
+# ERR_ALIGN1: error: .note.foo: invalid alignment for a note section: 0x1
+
+--- !ELF
+FileHeader:
+  Class: ELFCLASS64
+  Data:  ELFDATA2LSB
+  Type:  ET_EXEC
+Sections:
+  - Name:         .note.foo
+    Type:         SHT_NOTE
+    AddressAlign: 1
+    Notes:
+      - Type: 0x1
+
+## Check that note entries and their `Desc` fields are aligned according to the
+## specified section alignment
+
+# RUN: yaml2obj --docnum=17 %s -o - | \
+# RUN:   llvm-readobj --sections --section-data --notes - | \
+# RUN:   FileCheck %s --check-prefix=TEST17
+
+# TEST17:      Name: .note.foo4
+# TEST17:      SectionData (
+# TEST17-NEXT:   0000: 05000000 02000000 01000000 41424344  |............ABCD|
+# TEST17-NEXT:   0010: 00000000 01020000 00000000 03000000  |................|
+# TEST17-NEXT:   0020: 02000000 03040500 04000000 00000000  |................|
+# TEST17-NEXT:   0030: 03000000 474E5500                    |....GNU.|
+# TEST17-NEXT: )
+# TEST17:      Name: .note.foo8
+# TEST17:      SectionData (
+# TEST17-NEXT:   0000: 05000000 02000000 01000000 41424344  |............ABCD|
+# TEST17-NEXT:   0010: 00000000 00000000 01020000 00000000  |................|
+# TEST17-NEXT:   0020: 00000000 03000000 02000000 00000000  |................|
+# TEST17-NEXT:   0030: 03040500 00000000 04000000 00000000  |................|
+# TEST17-NEXT:   0040: 03000000 474E5500                    |....GNU.|
+# TEST17-NEXT: )
+# TEST17:      NoteSections [
+# TEST17-NEXT:   NoteSection {
+# TEST17-NEXT:     Name: .note.foo4
+# TEST17-NEXT:     Offset:
+# TEST17-NEXT:     Size:
+# TEST17-NEXT:     Notes [
+# TEST17-NEXT:       {
+# TEST17-NEXT:         Owner: ABCD
+# TEST17-NEXT:         Data size: 0x2
+# TEST17-NEXT:         Type: NT_VERSION (version)
+# TEST17-NEXT:         Description data (
+# TEST17-NEXT:           0000: 0102                                 |..|
+# TEST17-NEXT:         )
+# TEST17-NEXT:       }
+# TEST17-NEXT:       {
+# TEST17-NEXT:         Owner: 
+# TEST17-NEXT:         Data size: 0x3
+# TEST17-NEXT:         Type: NT_ARCH (architecture)
+# TEST17-NEXT:         Description data (
+# TEST17-NEXT:           0000: 030405                               |...|
+# TEST17-NEXT:         )
+# TEST17-NEXT:       }
+# TEST17-NEXT:       {
+# TEST17-NEXT:         Owner: GNU
+# TEST17-NEXT:         Data size: 0x0
+# TEST17-NEXT:         Type: NT_GNU_BUILD_ID (unique build ID bitstring)
+# TEST17-NEXT:         Build ID: 
+# TEST17-NEXT:       }
+# TEST17-NEXT:     ]
+# TEST17-NEXT:   }
+# TEST17-NEXT:   NoteSection {
+# TEST17-NEXT:     Name: .note.foo8
+# TEST17-NEXT:     Offset:
+# TEST17-NEXT:     Size:
+# TEST17-NEXT:     Notes [
+# TEST17-NEXT:       {
+# TEST17-NEXT:         Owner: ABCD
+# TEST17-NEXT:         Data size: 0x2
+# TEST17-NEXT:         Type: NT_VERSION (version)
+# TEST17-NEXT:         Description data (
+# TEST17-NEXT:           0000: 0102                                 |..|
+# TEST17-NEXT:         )
+# TEST17-NEXT:       }
+# TEST17-NEXT:       {
+# TEST17-NEXT:         Owner: 
+# TEST17-NEXT:         Data size: 0x3
+# TEST17-NEXT:         Type: NT_ARCH (architecture)
+# TEST17-NEXT:         Description data (
+# TEST17-NEXT:           0000: 030405                               |...|
+# TEST17-NEXT:         )
+# TEST17-NEXT:       }
+# TEST17-NEXT:       {
+# TEST17-NEXT:         Owner: GNU
+# TEST17-NEXT:         Data size: 0x0
+# TEST17-NEXT:         Type: NT_GNU_BUILD_ID (unique build ID bitstring)
+# TEST17-NEXT:         Build ID: 
+# TEST17-NEXT:       }
+# TEST17-NEXT:     ]
+# TEST17-NEXT:   }
+# TEST17-NEXT: ]
+
+--- !ELF
+FileHeader:
+  Class: ELFCLASS64
+  Data:  ELFDATA2LSB
+  Type:  ET_EXEC
+Sections:
+  - Name:         .note.foo4
+    Type:         SHT_NOTE
+    AddressAlign: 4
+    Notes:
+      - Name: ABCD
+        Type: NT_VERSION
+        Desc: 0102
+      - Type: NT_ARCH
+        Desc: 030405
+      - Name: GNU
+        Type: NT_GNU_BUILD_ID
+  - Name:        .note.foo8
+    Type:         SHT_NOTE
+    AddressAlign: 8
+    Notes:
+      - Name: ABCD
+        Type: NT_VERSION
+        Desc: 0102
+      - Type: NT_ARCH
+        Desc: 030405
+      - Name: GNU
+        Type: NT_GNU_BUILD_ID

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what we should do for ELF32 instances, where the alignment is 8. In the spirit of yaml2obj's permissiveness, I think we should allow it, in which case we should have a test for it.

More generally, yaml2obj tries to be permissive in what it allows. In other words, it should try to write the object described regardless of the ABI rules on what is permitted. What's preventing us aligning each note to e.g. 2 or 3?

@igorkudrin
Copy link
Collaborator Author

I'm wondering what we should do for ELF32 instances, where the alignment is 8. In the spirit of yaml2obj's permissiveness, I think we should allow it, in which case we should have a test for it.

I don't see anything wrong with yanl2obj generating 8-byte aligned notes for ELF32, so I've added the test.

More generally, yaml2obj tries to be permissive in what it allows. In other words, it should try to write the object described regardless of the ABI rules on what is permitted. What's preventing us aligning each note to e.g. 2 or 3?

Aligning notes to anything other than 4 or 8 makes no sense, because parsers would not be able to read them correctly. If a test writer needs a nonsense notes section, they can always use Content: instead of Notes:. Some existing tests even do this, for example, tools/llvm-objcopy/ELF/only-keep-debug.test and tools/llvm-readobj/ELF/note-alignment-invalid.test.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aligning notes to anything other than 4 or 8 makes no sense, because parsers would not be able to read them correctly. If a test writer needs a nonsense notes section, they can always use Content: instead of Notes:. Some existing tests even do this, for example, tools/llvm-objcopy/ELF/only-keep-debug.test and tools/llvm-readobj/ELF/note-alignment-invalid.test.

That's fair, thanks (I'm less concerned about generating something parsers could read with this, but the point about being able to use Content for invalid data is compelling enough for me).

@@ -1810,7 +1810,7 @@ void ELFState<ELFT>::writeSectionContent(Elf_Shdr &SHeader,
break;
default:
reportError(Section.Name + ": invalid alignment for a note section: 0x" +
Twine::utohexstr(SHeader.sh_addralign));
Twine::utohexstr(Section.AddressAlign));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly not worth it, but I wonder if a test is in order to highlight which alignment is actually used (i.e. Section.AddressAlign, not SHeader.sh_addralign)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a test can check this because Sec->AddressAlign is assigned to SHeader.sh_addralign in ELFState<ELFT>::initSectionHeaders() just a few lines before writeSectionContent() is called.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I posted this comment in the wrong place by accident, so it might be confusing things a little. I actually was referring to the switch a bit above, not specifically the error message. In other words, showing that if ShAddrAlign: 4 is specified but AddressAlign: 8 is specified, the 8 is used for padding purposes. In other words, catching the problem I pointed out earlier and that got changed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, overrideFields() is called after writeSectionContent(), so even if there are different values in ShAddrAlign and AddressAlign properties, writeSectionContent() will still see the same value in Section.AddressAlign and SHeader.sh_addralign, taken from AddressAlign. Nevertheless, I've added a new test for this scenario. It even shows that you can generate a note section with an invalid alignment without triggering an error, as long as an expected alignment is specified in the AddressAlign property.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - up to you whether you add the test case I'm talking about or not.

@@ -1810,7 +1810,7 @@ void ELFState<ELFT>::writeSectionContent(Elf_Shdr &SHeader,
break;
default:
reportError(Section.Name + ": invalid alignment for a note section: 0x" +
Twine::utohexstr(SHeader.sh_addralign));
Twine::utohexstr(Section.AddressAlign));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I posted this comment in the wrong place by accident, so it might be confusing things a little. I actually was referring to the switch a bit above, not specifically the error message. In other words, showing that if ShAddrAlign: 4 is specified but AddressAlign: 8 is specified, the 8 is used for padding purposes. In other words, catching the problem I pointed out earlier and that got changed.

@igorkudrin igorkudrin merged commit 740ac4f into llvm:main Dec 5, 2024
8 checks passed
@igorkudrin igorkudrin deleted the yaml2obj-notes-alignment-fix branch December 5, 2024 05:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants