Skip to content

[LLD][COFF] Ensure .bss is merged at the end of a section. #137677

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 30, 2025

Conversation

jeremyd2019
Copy link
Contributor

Because it is full of zeros, it is expected that as much of it as possible is elided from the actual image, and that cannot happen if there is initialized data in the section after it.

@jeremyd2019 jeremyd2019 force-pushed the lld-merge-bss-at-end-of-data branch from ae66955 to 6cefe38 Compare April 28, 2025 18:15
@jeremyd2019 jeremyd2019 changed the title [LLD][COFF] Merge .bss at end of .data section. [LLD][COFF] Ensure .bss is merged at the end of a section. Apr 28, 2025
@jeremyd2019 jeremyd2019 marked this pull request as ready for review April 28, 2025 18:19
@llvmbot
Copy link
Member

llvmbot commented Apr 28, 2025

@llvm/pr-subscribers-platform-windows
@llvm/pr-subscribers-lld-coff

@llvm/pr-subscribers-lld

Author: None (jeremyd2019)

Changes

Because it is full of zeros, it is expected that as much of it as possible is elided from the actual image, and that cannot happen if there is initialized data in the section after it.


Full diff: https://github.com/llvm/llvm-project/pull/137677.diff

2 Files Affected:

  • (modified) lld/COFF/Writer.cpp (+31-21)
  • (added) lld/test/COFF/merge-data-bss.test (+92)
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index a5582cc8074d1..00e63cb90c2d0 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -215,6 +215,7 @@ class Writer {
   void appendImportThunks();
   void locateImportTables();
   void createExportTable();
+  void mergeSection(const std::map<StringRef, StringRef>::value_type &p);
   void mergeSections();
   void sortECChunks();
   void appendECImportTables();
@@ -1566,6 +1567,30 @@ void Writer::createSymbolAndStringTable() {
   fileSize = alignTo(fileOff, ctx.config.fileAlign);
 }
 
+void Writer::mergeSection(const std::map<StringRef, StringRef>::value_type &p) {
+  StringRef toName = p.second;
+  if (p.first == toName)
+    return;
+  StringSet<> names;
+  while (true) {
+    if (!names.insert(toName).second)
+      Fatal(ctx) << "/merge: cycle found for section '" << p.first << "'";
+    auto i = ctx.config.merge.find(toName);
+    if (i == ctx.config.merge.end())
+      break;
+    toName = i->second;
+  }
+  OutputSection *from = findSection(p.first);
+  OutputSection *to = findSection(toName);
+  if (!from)
+    return;
+  if (!to) {
+    from->name = toName;
+    return;
+  }
+  to->merge(from);
+}
+
 void Writer::mergeSections() {
   llvm::TimeTraceScope timeScope("Merge sections");
   if (!pdataSec->chunks.empty()) {
@@ -1594,28 +1619,13 @@ void Writer::mergeSections() {
   }
 
   for (auto &p : ctx.config.merge) {
-    StringRef toName = p.second;
-    if (p.first == toName)
-      continue;
-    StringSet<> names;
-    while (true) {
-      if (!names.insert(toName).second)
-        Fatal(ctx) << "/merge: cycle found for section '" << p.first << "'";
-      auto i = ctx.config.merge.find(toName);
-      if (i == ctx.config.merge.end())
-        break;
-      toName = i->second;
-    }
-    OutputSection *from = findSection(p.first);
-    OutputSection *to = findSection(toName);
-    if (!from)
-      continue;
-    if (!to) {
-      from->name = toName;
-      continue;
-    }
-    to->merge(from);
+    if (p.first != ".bss")
+      mergeSection(p);
   }
+
+  auto it = ctx.config.merge.find(".bss");
+  if (it != ctx.config.merge.end())
+    mergeSection(*it);
 }
 
 // EC targets may have chunks of various architectures mixed together at this
diff --git a/lld/test/COFF/merge-data-bss.test b/lld/test/COFF/merge-data-bss.test
new file mode 100644
index 0000000000000..a821f8d6e9048
--- /dev/null
+++ b/lld/test/COFF/merge-data-bss.test
@@ -0,0 +1,92 @@
+# RUN: yaml2obj %s -o %t.obj
+# RUN: lld-link /out:%t.exe /entry:main /subsystem:console /force \
+# RUN:   /merge:.other=.data %t.obj /debug
+# RUN: llvm-readobj --sections %t.exe | FileCheck %s
+
+# CHECK:      Name: .data
+# CHECK-NEXT: VirtualSize: 0x2018
+# CHECK-NEXT: VirtualAddress: 0x3000
+# CHECK-NEXT: RawDataSize: 512
+# CHECK-NOT: Name: .other
+
+--- !COFF
+header:
+  Machine:         IMAGE_FILE_MACHINE_AMD64
+  Characteristics: [  ]
+sections:
+  - Name:            .text
+    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
+    Alignment:       4
+    SectionData:     '90'
+    SizeOfRawData:   1
+  - Name:            .data
+    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
+    Alignment:       4
+    SectionData:     '010000000000000002'
+    SizeOfRawData:   9
+  - Name:            .bss
+    Characteristics: [ IMAGE_SCN_CNT_UNINITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
+    Alignment:       4
+    SectionData:     ''
+    SizeOfRawData:   8192
+  - Name:            .other
+    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ, IMAGE_SCN_MEM_WRITE ]
+    Alignment:       4
+    SectionData:     '030000000000000004'
+    SizeOfRawData:   9
+symbols:
+  - Name:            .text
+    Value:           0
+    SectionNumber:   1
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_STATIC
+    SectionDefinition:
+      Length:          1
+      NumberOfRelocations: 0
+      NumberOfLinenumbers: 0
+      CheckSum:        4027552580
+      Number:          1
+  - Name:            .data
+    Value:           0
+    SectionNumber:   2
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_STATIC
+    SectionDefinition:
+      Length:          9
+      NumberOfRelocations: 0
+      NumberOfLinenumbers: 0
+      CheckSum:        4185224559
+      Number:          2
+  - Name:            .bss
+    Value:           0
+    SectionNumber:   3
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_STATIC
+    SectionDefinition:
+      Length:          8192
+      NumberOfRelocations: 0
+      NumberOfLinenumbers: 0
+      CheckSum:        0
+      Number:          3
+  - Name:            .other
+    Value:           0
+    SectionNumber:   4
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_STATIC
+    SectionDefinition:
+      Length:          9
+      NumberOfRelocations: 0
+      NumberOfLinenumbers: 0
+      CheckSum:        1054931164
+      Number:          4
+  - Name:            main
+    Value:           0
+    SectionNumber:   1
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_NULL
+    StorageClass:    IMAGE_SYM_CLASS_EXTERNAL
+...

@jeremyd2019
Copy link
Contributor Author

Split from #136180 as requested by @mstorsjo

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, with just one question just to be sure, and one request for an added comment.

After approving, I'd wait a day or two here in case someone else might want to speak up about this (e.g. with suggestions about other ways of doing it?), but this looks acceptable to me.

Thanks!

@@ -1566,6 +1567,30 @@ void Writer::createSymbolAndStringTable() {
fileSize = alignTo(fileOff, ctx.config.fileAlign);
}

void Writer::mergeSection(const std::map<StringRef, StringRef>::value_type &p) {
StringRef toName = p.second;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this factorized function is just a direct move of the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, with indents reduced and continues turned into returns.

Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
@jeremyd2019 jeremyd2019 force-pushed the lld-merge-bss-at-end-of-data branch from 6cefe38 to 48dcc96 Compare April 28, 2025 21:35
Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@mstorsjo mstorsjo merged commit bb2f759 into llvm:main Apr 30, 2025
11 checks passed
@jeremyd2019 jeremyd2019 deleted the lld-merge-bss-at-end-of-data branch April 30, 2025 17:01
jeremyd2019 added a commit to jeremyd2019/llvm-project that referenced this pull request May 2, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
jeremyd2019 added a commit to jeremyd2019/llvm-project that referenced this pull request May 2, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
jeremyd2019 added a commit to jeremyd2019/llvm-project that referenced this pull request May 3, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
jeremyd2019 added a commit to jeremyd2019/llvm-project that referenced this pull request May 5, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
jeremyd2019 added a commit to jeremyd2019/llvm-project that referenced this pull request May 14, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
jeremyd2019 added a commit to jeremyd2019/llvm-project that referenced this pull request May 15, 2025
Because it is full of zeros, it is expected that as much of it as
possible is elided from the actual image, and that cannot happen if
there is initialized data in the section after it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants