Skip to content

[ELF] Orphan placement: remove hasInputSections condition #93761

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented May 30, 2024

https://reviews.llvm.org/D60131 (Change default output section type to
SHT_PROGBITS) caused a orphan placement regression for Fuchsia
zircon.elf: #40998 The orphan section code_patch_table was placed
before the first output section description .text.boot0, breaking the
address requirement.

https://reviews.llvm.org/D61197 (Fix getRankProximity to "ignore" not
live sections) fixed the regression by adding a Live condition (which
later became hasInputSections).

This condition added complexity, which turns out to be unneeded after

The new orphan placement rule is slightly different (orphans can be
placed after an output section that does not contain input sections),
but simpler and probably more reasonable.

  • nobits-offset.s: .sec1 (NOLOAD) : { . += 1; }
  • symbol-only-align.test: foo without an input section gets the flags A.
    The orphan .dynsym can now be placed after it.
    (Due to PHDRS, .dynsym with a lower rank must be placed after the anchor.)
  • tls-nobits-offset.s: the orphan .text has a larger rank than the
    anchor (.sec1)'s and is placed after the anchor.

memory-nonalloc-no-warn.test would need a change if we don't apply
#94519.

Created using spr 1.3.5-bogner
@llvmbot
Copy link
Member

llvmbot commented May 30, 2024

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-elf

Author: Fangrui Song (MaskRay)

Changes

https://reviews.llvm.org/D60131 (Change default output section type to
SHT_PROGBITS) caused a orphan placement regression for Fuchsia
zircon.elf: #40998 The orphan section code_patch_table was placed
before the first output section description .text.boot0, breaking the
address requirement.

https://reviews.llvm.org/D61197 (Fix getRankProximity to "ignore" not
live sections) fixed the regression by adding a Live condition (which
later became hasInputSections).

This condition added complexity, which turns out to be unneeded after
3bdc90e and
f639b57. The new orphan placement rule
is slightly different (orphans can be placed after an output section
that does not contain input sections), but simpler and probably more
reasonable.

  • nobits-offset.s: .sec1 (NOLOAD) : { . += 1; }
  • orphan-live-only.s: .pad : { QUAD(0); } :exec
  • memory-nonalloc-no-warn.test: .nonalloc, .dat, and .intvec[012]_out have the same rank.
    .comment was previously placed after .nonalloc and now after .intvec2_out.

Full diff: https://github.com/llvm/llvm-project/pull/93761.diff

6 Files Affected:

  • (modified) lld/ELF/Writer.cpp (+6-12)
  • (modified) lld/test/ELF/linkerscript/memory-nonalloc-no-warn.test (+9-10)
  • (modified) lld/test/ELF/linkerscript/nobits-offset.s (+1-1)
  • (modified) lld/test/ELF/linkerscript/orphan-live-only.s (+3-3)
  • (modified) lld/test/ELF/linkerscript/symbol-only-align.test (+6-5)
  • (modified) lld/test/ELF/linkerscript/tls-nobits-offset.s (+1-1)
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index c498153f3348b..4add995c93458 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -890,9 +890,7 @@ template <class ELFT> void Writer<ELFT>::setReservedSymbolSections() {
 // countLeadingZeros.
 static int getRankProximity(OutputSection *a, SectionCommand *b) {
   auto *osd = dyn_cast<OutputDesc>(b);
-  return (osd && osd->osec.hasInputSections)
-             ? llvm::countl_zero(a->sortRank ^ osd->osec.sortRank)
-             : -1;
+  return osd ? llvm::countl_zero(a->sortRank ^ osd->osec.sortRank) : -1;
 }
 
 // When placing orphan sections, we want to place them after symbol assignments
@@ -958,20 +956,16 @@ findOrphanPos(SmallVectorImpl<SectionCommand *>::iterator b,
     sortRank = std::max(sortRank, foundSec->sortRank);
   for (; i != e; ++i) {
     auto *curSecDesc = dyn_cast<OutputDesc>(*i);
-    if (!curSecDesc || !curSecDesc->osec.hasInputSections)
+    if (!curSecDesc)
       continue;
     if (getRankProximity(sec, curSecDesc) != proximity ||
         sortRank < curSecDesc->osec.sortRank)
       break;
   }
 
-  auto isOutputSecWithInputSections = [](SectionCommand *cmd) {
-    auto *osd = dyn_cast<OutputDesc>(cmd);
-    return osd && osd->osec.hasInputSections;
-  };
-  auto j =
-      std::find_if(std::make_reverse_iterator(i), std::make_reverse_iterator(b),
-                   isOutputSecWithInputSections);
+  auto isOutputSec = [](SectionCommand *cmd) { return isa<OutputDesc>(cmd); };
+  auto j = std::find_if(std::make_reverse_iterator(i),
+                        std::make_reverse_iterator(b), isOutputSec);
   i = j.base();
 
   // As a special case, if the orphan section is the last section, put
@@ -979,7 +973,7 @@ findOrphanPos(SmallVectorImpl<SectionCommand *>::iterator b,
   // This matches bfd's behavior and is convenient when the linker script fully
   // specifies the start of the file, but doesn't care about the end (the non
   // alloc sections for example).
-  auto nextSec = std::find_if(i, e, isOutputSecWithInputSections);
+  auto nextSec = std::find_if(i, e, isOutputSec);
   if (nextSec == e)
     return e;
 
diff --git a/lld/test/ELF/linkerscript/memory-nonalloc-no-warn.test b/lld/test/ELF/linkerscript/memory-nonalloc-no-warn.test
index 2dcd0f8d6ce2f..9c6111008c818 100644
--- a/lld/test/ELF/linkerscript/memory-nonalloc-no-warn.test
+++ b/lld/test/ELF/linkerscript/memory-nonalloc-no-warn.test
@@ -16,22 +16,21 @@
 ## The output file must include all sections.
 # RUN: llvm-readelf -S %t/a.elf | FileCheck %s
 
-# CHECK:      There are 12 section headers, starting at offset 0x2140:
+# CHECK:      There are 12 section headers, starting at offset 0x2138:
 # CHECK:      [Nr] Name         Type      Address          Off    Size   ES Flg Lk Inf Al
 # CHECK-NEXT: [ 0]              NULL      0000000000000000 000000 000000 00     0  0   0
 # CHECK-NEXT: [ 1] .nonalloc    PROGBITS  0000000000000000 001064 001000 00 W   0  0   1
-# CHECK-NEXT: [ 2] .comment     PROGBITS  0000000000000000 {{.*}} {{.*}} 01 MS  0  0   1
-# CHECK-NEXT: [ 3] .symtab      SYMTAB    0000000000000000 {{.*}} {{.*}} 18     5  1   8
-# CHECK-NEXT: [ 4] .shstrtab    STRTAB    0000000000000000 {{.*}} {{.*}} 00     0  0   1
-# CHECK-NEXT: [ 5] .strtab      STRTAB    0000000000000000 {{.*}} {{.*}} 00     0  0   1
-# CHECK-NEXT: [ 6] .dat         PROGBITS  0000000000000000 002137 000004 00 W   0  0   1
-# CHECK-NEXT: [ 7] .intvec0_out PROGBITS  0000000000000000 00213b 000000 00 W   0  0   1
-# CHECK-NEXT: [ 8] .intvec1_out PROGBITS  0000000000000000 00213b 000000 00 W   0  0   1
-# CHECK-NEXT: [ 9] .intvec2_out PROGBITS  0000000000000000 00213b 000000 00 W   0  0   1
+# CHECK-NEXT: [ 2] .dat         PROGBITS  0000000000000000 002064 000004 00 W   0  0   1
+# CHECK-NEXT: [ 3] .intvec0_out PROGBITS  0000000000000000 002068 000000 00 W   0  0   1
+# CHECK-NEXT: [ 4] .intvec1_out PROGBITS  0000000000000000 002068 000000 00 W   0  0   1
+# CHECK-NEXT: [ 5] .intvec2_out PROGBITS  0000000000000000 002068 000000 00 W   0  0   1
+# CHECK-NEXT: [ 6] .comment     PROGBITS  0000000000000000 {{.*}} {{.*}} 01 MS  0  0   1
+# CHECK-NEXT: [ 7] .symtab      SYMTAB    0000000000000000 {{.*}} {{.*}} 18     9  1   8
+# CHECK-NEXT: [ 8] .shstrtab    STRTAB    0000000000000000 {{.*}} {{.*}} 00     0  0   1
+# CHECK-NEXT: [ 9] .strtab      STRTAB    0000000000000000 {{.*}} {{.*}} 00     0  0   1
 # CHECK-NEXT: [10] .intvec3_out PROGBITS  00000000803fe060 001060 000004 00 AX  0  0   1
 # CHECK-NEXT: [11] .text        PROGBITS  00000000803fe064 001064 000000 00 AX  0  0   4
 
-
 #--- a.s
 .global _start
 .text
diff --git a/lld/test/ELF/linkerscript/nobits-offset.s b/lld/test/ELF/linkerscript/nobits-offset.s
index 387eaed4cb146..b305cec6fb699 100644
--- a/lld/test/ELF/linkerscript/nobits-offset.s
+++ b/lld/test/ELF/linkerscript/nobits-offset.s
@@ -14,8 +14,8 @@
 
 # CHECK:      Name  Type     Address          Off     Size
 # CHECK-NEXT:       NULL     0000000000000000 000000  000000
-# CHECK-NEXT: .text PROGBITS 0000000000000000 000190  000000
 # CHECK-NEXT: .sec1 NOBITS   0000000000000000 001000  000001
+# CHECK-NEXT: .text PROGBITS 0000000000000004 001000  000000
 # CHECK-NEXT: .bss  NOBITS   0000000000000400 001400  000001
 
 # CHECK:      Type Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
diff --git a/lld/test/ELF/linkerscript/orphan-live-only.s b/lld/test/ELF/linkerscript/orphan-live-only.s
index 500005173a866..900532e0eab61 100644
--- a/lld/test/ELF/linkerscript/orphan-live-only.s
+++ b/lld/test/ELF/linkerscript/orphan-live-only.s
@@ -20,14 +20,14 @@
 
 # CHECK: Section Headers
 # CHECK: .pad
+# CHECK-NEXT: .orphan1
 # CHECK-NEXT: .text
 # CHECK-NEXT: .orphan2
 # CHECK-NEXT: .ro
-# CHECK-NEXT: .orphan1
 
 # CHECK: Segment Sections
-# CHECK-NEXT: .pad .text .orphan2
-# CHECK-NEXT: .ro .orphan1
+# CHECK-NEXT: .pad .orphan1 .text .orphan2
+# CHECK-NEXT: .ro
 
 .section .text,"ax"
  ret
diff --git a/lld/test/ELF/linkerscript/symbol-only-align.test b/lld/test/ELF/linkerscript/symbol-only-align.test
index f4ecf275ec0bd..c553e8138b5eb 100644
--- a/lld/test/ELF/linkerscript/symbol-only-align.test
+++ b/lld/test/ELF/linkerscript/symbol-only-align.test
@@ -23,13 +23,14 @@ SECTIONS {
 ## offset and addresses match.
 
 # CHECK: Section Headers
-# CHECK:      foo   PROGBITS 0000000000[[ADDR:[0-9a-f]*]] [[ADDR]]
-# CHECK-NEXT: .data PROGBITS 0000000000[[ADDR]] [[ADDR]]
+# CHECK:      .text     PROGBITS
+# CHECK-NEXT: foo       PROGBITS 0000000000001000 001000 000000
+# CHECK-NEXT: .dynsym   DYNSYM   0000000000001000 001000
 
 # CHECK: Program Headers
 # CHECK:      LOAD
-# CHECK-NEXT: LOAD 0x[[ADDR]] 0x0000000000[[ADDR]] 0x0000000000[[ADDR]]
+# CHECK-NEXT: LOAD 0x001000   0x0000000000001000   0x0000000000001000
 
 # CHECK: Symbol table
-# CHECK: 0000000000[[ADDR]] 0 NOTYPE GLOBAL DEFAULT {{[0-9]+}} __start_foo
-# CHECK: 0000000000[[ADDR]] 0 NOTYPE GLOBAL DEFAULT {{[0-9]+}} __end_foo
+# CHECK: 0000000000001000   0 NOTYPE GLOBAL DEFAULT [[#]] __start_foo
+# CHECK: 0000000000001000   0 NOTYPE GLOBAL DEFAULT [[#]] __end_foo
diff --git a/lld/test/ELF/linkerscript/tls-nobits-offset.s b/lld/test/ELF/linkerscript/tls-nobits-offset.s
index 9aab32317be4c..7830b457e1b4d 100644
--- a/lld/test/ELF/linkerscript/tls-nobits-offset.s
+++ b/lld/test/ELF/linkerscript/tls-nobits-offset.s
@@ -14,8 +14,8 @@
 
 # CHECK:      Name  Type     Address          Off     Size
 # CHECK-NEXT:       NULL     0000000000000000 000000  000000
-# CHECK-NEXT: .text PROGBITS 0000000000000000 000190  000000
 # CHECK-NEXT: .sec1 PROGBITS 0000000000000000 001000  000001
+# CHECK-NEXT: .text PROGBITS 0000000000000004 001004  000000
 # CHECK-NEXT: .tbss NOBITS   0000000000000400 001400  000001
 
 # CHECK:      Type Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align

@MaskRay MaskRay requested review from igorkudrin, nga888 and smithp35 May 30, 2024 03:14
@MaskRay
Copy link
Member Author

MaskRay commented May 30, 2024

The description contains a lot of links to previous patches, but they are not strictly needed to understand this patch. In short, the previous && Sec->Live condition is unneeded.

Copy link
Collaborator

@igorkudrin igorkudrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, the linker now takes into account sections defined in a linker script when placing orphans, right? What if some kind of padding/filler section is defined in the script? It looks like it will attract orphan data sections when placed before actual data sections. Do you think a new RankFlag can be added to make real sections more preferable?

@MaskRay
Copy link
Member Author

MaskRay commented May 30, 2024

Basically, the linker now takes into account sections defined in a linker script when placing orphans, right?

Yes.

What if some kind of padding/filler section is defined in the script? It looks like it will attract orphan data sections when placed before actual data sections.

We rely on SHF_ALLOC | SHF_WRITE inference for output sections without an input section.

// adjustOutputSections
    if (isEmpty) {
      sec->flags =
          flags & ((sec->nonAlloc ? 0 : (uint64_t)SHF_ALLOC) | SHF_WRITE);
      sec->sortRank = getSectionRank(*sec);
    }

Do you think a new RankFlag can be added to make real sections more preferable?

Orphan placement seems primarily focused on choosing a suitable region (R/RX/RW/non-ALLOC).
findOrphanPos appears to be effective in achieving this.

A RankFlag might allow for more precise placement within the chosen region, but I suspect it may not be necessary.
That said, I think this is a quite good suggestion and worth exploring if we run into a case where such a granularity is needed.
Out orphan placement rule is quite complex. Without further evidence I want to hold off adding a new RankFlag.

Comment on lines 29 to 30
# CHECK-NEXT: .pad .text .orphan2
# CHECK-NEXT: .ro .orphan1
# CHECK-NEXT: .pad .orphan1 .text .orphan2
# CHECK-NEXT: .ro
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change of placement of .orphan1 does not look correct given that its flags are "a".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, I think it was this behaviour that was the reason for the Live patch from https://reviews.llvm.org/D61197, i.e. to ignore .pad when placing orphans as stated by the comment on line 15 in this test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only just read @igorkudrin's comment which basically explains why the above has changed. If a new RankFlag is preferable to retaining the use of hasInputSections() then +1 from me. The key thing from our point of view is that the behaviour in this test, orphan-live-only.s is maintained.

Copy link
Member Author

@MaskRay MaskRay May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w/o this patch; .orphan1 is after .ro (the second A, but the first A that contains input sections).

  [ 1] .pad              PROGBITS        0000000000000000 001000 000008 00   A  0   0  1
  [ 2] .text             PROGBITS        0000000000000008 001008 000001 00  AX  0   0  4
  [ 3] .orphan2          PROGBITS        0000000000000009 001009 000001 00  AX  0   0  1
  [ 4] .ro               PROGBITS        000000000000000a 00100a 000001 00   A  0   0  1
  [ 5] .orphan1          PROGBITS        000000000000000b 00100b 000001 00   A  0   0  1

w/ this patch; .orphan1 is after .ro (the first A).

  [ 1] .pad              PROGBITS        0000000000000000 001000 000008 00   A  0   0  1
  [ 2] .orphan1          PROGBITS        0000000000000008 001008 000001 00   A  0   0  1
  [ 3] .text             PROGBITS        000000000000000c 00100c 000001 00  AX  0   0  4
  [ 4] .orphan2          PROGBITS        000000000000000d 00100d 000001 00  AX  0   0  1
  [ 5] .ro               PROGBITS        000000000000000e 00100e 000001 00   A  0   0  1

I think the new behavior is more reasonable.

The key thing from our point of view is that the behaviour in this test, orphan-live-only.s is maintained.

orphan-live-only.s might be overly reduced. Is there a better justification for the skip-padding-section behavior?

Currently output sections without an input section rely heavily on the flags propagation rule.
With the introduction of osec (READONLY) : { } and possible future addition for SHF_EXECINSTR, I think these sections should be used as an anchor as well.

For the reasonable use cases, WA pad WA pad WA, w/o or w/ this patch, a new WA section will be placed at the end of the consecutive WA/pad sequence.
The same applies to A pad A pad A.

Therefore, I don't think a new RankFlag is needed.

@smithp35
Copy link
Collaborator

Will take a look next week if it hasn't already been approved. Apologies out of the office till the end of the week.

@igorkudrin
Copy link
Collaborator

Basically, the linker now takes into account sections defined in a linker script when placing orphans, right?
Yes.

My concerns are for cases like:

> cat test.s
.section ".text1","ax","progbits"; nop;
.section ".text2","ax","progbits"; nop;
.section ".rodata","a","progbits"; .long 0;
.section ".orphan","a","progbits"; .long 0;
> cat test.t
PHDRS { text PT_LOAD; rodata PT_LOAD; }
SECTIONS {
  .text1 : { *(.text1) } : text
  #.other : { foo = .; *(.other) }
  .other : { LONG(0); *(.other) }
  .text2 : { *(.text2) }
  .rodata : { *(.rodata) } : rodata
}
> llvm-mc -filetype=obj test.s -o test.o
> ld.lld -shared test.o -T test.t -o test.so
> llvm-readelf -l test.so
...
  LOAD           0x001000 0x0000000000000000 0x0000000000000000 0x000056 0x000056 R E 0x1000
  LOAD           0x001056 0x0000000000000056 0x0000000000000056 0x00007a 0x00007a RW  0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text1 .text .other .orphan .dynsym .gnu.hash .hash .dynstr .text2
   01     .rodata .dynamic
...

Of course, this example is oversimplified, but I see something similar in our projects.

For this sample, GNU linkers and current lld place .orphan and the synthetic sections after .rodata, while the patched lld puts them in the middle of an executable segment. The problem is that output sections without input sections are usually unaware of their flags, as they normally inherit them from their input sections. With the new placement rules, such faint sections are used as anchors for orphans, moving them to unsuitable locations.

MaskRay added 2 commits May 31, 2024 22:37
Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
Copy link

github-actions bot commented Jun 1, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@MaskRay
Copy link
Member Author

MaskRay commented Jun 1, 2024

Will take a look next week if it hasn't already been approved. Apologies out of the office till the end of the week.

Thanks. Take your time!

Basically, the linker now takes into account sections defined in a linker script when placing orphans, right?
Yes.

My concerns are for cases like:

> cat test.s
.section ".text1","ax","progbits"; nop;
.section ".text2","ax","progbits"; nop;
.section ".rodata","a","progbits"; .long 0;
.section ".orphan","a","progbits"; .long 0;
> cat test.t
PHDRS { text PT_LOAD; rodata PT_LOAD; }
SECTIONS {
  .text1 : { *(.text1) } : text
  #.other : { foo = .; *(.other) }
  .other : { LONG(0); *(.other) }
  .text2 : { *(.text2) }
  .rodata : { *(.rodata) } : rodata
}
> llvm-mc -filetype=obj test.s -o test.o
> ld.lld -shared test.o -T test.t -o test.so
> llvm-readelf -l test.so
...
  LOAD           0x001000 0x0000000000000000 0x0000000000000000 0x000056 0x000056 R E 0x1000
  LOAD           0x001056 0x0000000000000056 0x0000000000000056 0x00007a 0x00007a RW  0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text1 .text .other .orphan .dynsym .gnu.hash .hash .dynstr .text2
   01     .rodata .dynamic
...

Of course, this example is oversimplified, but I see something similar in our projects.

For this sample, GNU linkers and current lld place .orphan and the synthetic sections after .rodata, while the patched lld puts them in the middle of an executable segment. The problem is that output sections without input sections are usually unaware of their flags, as they normally inherit them from their input sections. With the new placement rules, such faint sections are used as anchors for orphans, moving them to unsuitable locations.

Thanks for the example. This and #92987 convince me that we should do #94099 (when there are multiple most similar sections, select the last one if its rank <= orphan's rank).

With #94099, this patch will not cause a test change and your .orphan example will have no behavior change.

I think this patch is still controversial. #94099 is probably a clear improvement.

Created using spr 1.3.5-bogner
@MaskRay
Copy link
Member Author

MaskRay commented Jun 4, 2024

Will take a look next week if it hasn't already been approved. Apologies out of the office till the end of the week.

Thanks. Take your time!

Basically, the linker now takes into account sections defined in a linker script when placing orphans, right?
Yes.

My concerns are for cases like:

> cat test.s
.section ".text1","ax","progbits"; nop;
.section ".text2","ax","progbits"; nop;
.section ".rodata","a","progbits"; .long 0;
.section ".orphan","a","progbits"; .long 0;
> cat test.t
PHDRS { text PT_LOAD; rodata PT_LOAD; }
SECTIONS {
  .text1 : { *(.text1) } : text
  #.other : { foo = .; *(.other) }
  .other : { LONG(0); *(.other) }
  .text2 : { *(.text2) }
  .rodata : { *(.rodata) } : rodata
}
> llvm-mc -filetype=obj test.s -o test.o
> ld.lld -shared test.o -T test.t -o test.so
> llvm-readelf -l test.so
...
  LOAD           0x001000 0x0000000000000000 0x0000000000000000 0x000056 0x000056 R E 0x1000
  LOAD           0x001056 0x0000000000000056 0x0000000000000056 0x00007a 0x00007a RW  0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text1 .text .other .orphan .dynsym .gnu.hash .hash .dynstr .text2
   01     .rodata .dynamic
...

Of course, this example is oversimplified, but I see something similar in our projects.
For this sample, GNU linkers and current lld place .orphan and the synthetic sections after .rodata, while the patched lld puts them in the middle of an executable segment. The problem is that output sections without input sections are usually unaware of their flags, as they normally inherit them from their input sections. With the new placement rules, such faint sections are used as anchors for orphans, moving them to unsuitable locations.

Thanks for the example. This and #92987 convince me that we should do #94099 (when there are multiple most similar sections, select the last one if its rank <= orphan's rank).

With #94099, this patch will not cause a test change and your .orphan example will have no behavior change.

I think this patch is still controversial. #94099 is probably a clear improvement.

Rebased after #94099

I think this simplification is applicable.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that llvm-project/lld/test/ELF/linkerscript/orphan-live-only.s failed after applying this patch. It looks like this was mentioned in an earlier comment, but it doesn't look like that change has made it into the tests. I could be doing something wrong, so please ignore me if I have.

I agree that the change to orphan-live-only.s is reasonable.

I've also run the bin/ld.lld -shared test.o -T test.t -o test.so from an earlier comment and that also gives a good result with the change

I think the non-alloc synthetic section placement could do with some improvement but I think that would have to be done with a separate patch.

# CHECK-NEXT: [ 3] .intvec0_out PROGBITS 0000000000000000 002068 000000 00 W 0 0 1
# CHECK-NEXT: [ 4] .intvec1_out PROGBITS 0000000000000000 002068 000000 00 W 0 0 1
# CHECK-NEXT: [ 5] .intvec2_out PROGBITS 0000000000000000 002068 000000 00 W 0 0 1
# CHECK-NEXT: [ 6] .comment PROGBITS 0000000000000000 {{.*}} {{.*}} 01 MS 0 0 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me think we haven't got the rank order quite right, or we need a special case for non-alloc (possibly even synthetic non-alloc) sections so that these go at the end.

For example GNU ld (which removes intvec0_out to intvec2_out) gives:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .nonalloc         PROGBITS        0000000000000000 001064 001000 00   W  0   0  1
  [ 2] .dat              PROGBITS        0000000010000000 001000 000004 00   A  0   0  1
  [ 3] .intvec3_out      PROGBITS        00000000803fe060 001060 000004 00  AX  0   0  1
  [ 4] .symtab           SYMTAB          0000000000000000 002068 000048 18      5   1  8
  [ 5] .strtab           STRTAB          0000000000000000 0020b0 000012 00      0   0  1
  [ 6] .shstrtab         STRTAB          0000000000000000 0020c2 000037 00      0   0  1

I think it is better than it was before though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I was thinking of a special case at the while (nonScriptI != e) (near Writer.cpp:1313) loop that we keep non-SHF_ALLOC orphan at the end.

It is probably still worth doing.

@@ -16,22 +16,21 @@
## The output file must include all sections.
# RUN: llvm-readelf -S %t/a.elf | FileCheck %s

# CHECK: There are 12 section headers, starting at offset 0x2140:
# CHECK: There are 12 section headers, starting at offset 0x2138:
# CHECK: [Nr] Name Type Address Off Size ES Flg Lk Inf Al
# CHECK-NEXT: [ 0] NULL 0000000000000000 000000 000000 00 0 0 0
# CHECK-NEXT: [ 1] .nonalloc PROGBITS 0000000000000000 001064 001000 00 W 0 0 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment on the line, this is a forward reference. In the linker script

.nonalloc : { *(.nonalloc) } > MEM

MEM does not exist, only MEM and MEM1. GNU ld gives a warning in this case

ld.bfd:/data_nvme0n1/work/llvm-project/build/tools/lld/test/ELF/linkerscript/Output/memory-nonalloc-no-warn.test.tmp/a.lds:10: warning: memory region `MEM' not declared

As a noalloc section MEM may have been intentional, but my guess is that it was a typo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference in section header table offsets is due to less alignment padding by coincidence.


.nonalloc : { *(.nonalloc) } > MEM

lld reports warning: ignoring memory region assignment for non-allocatable section '.nonalloc'.

lld also supports an error (e.g. error: memory region 'ram' not declared for memory-err.s) but its precedence is lower than the non-allocatable warning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With #93761 , this patch will not need to modify this test file, but GitHub PR doesn't seem to make it clear.

Created using spr 1.3.5-bogner
@MaskRay
Copy link
Member Author

MaskRay commented Jun 5, 2024

I found that llvm-project/lld/test/ELF/linkerscript/orphan-live-only.s failed after applying this patch. It looks like this was mentioned in an earlier comment, but it doesn't look like that change has made it into the tests. I could be doing something wrong, so please ignore me if I have.

I agree that the change to orphan-live-only.s is reasonable.

orphan-live-only.s would fail if #94099 is not applied.

The description mentioned orphan-live-only.s. The test is not unmodified. Corrected the description.

I've also run the bin/ld.lld -shared test.o -T test.t -o test.so from an earlier comment and that also gives a good result with the change

I think the non-alloc synthetic section placement could do with some improvement but I think that would have to be done with a separate patch.

Perhaps we do need a !SHF_ALLOC orphan => at the end special case.

MaskRay added 2 commits June 5, 2024 11:36
Created using spr 1.3.5-bogner

[skip ci]
Created using spr 1.3.5-bogner
@MaskRay MaskRay changed the base branch from main to users/MaskRay/spr/main.elf-orphan-placement-remove-hasinputsections-condition June 5, 2024 18:36
@smithp35
Copy link
Collaborator

smithp35 commented Jun 6, 2024

I think #94519 looks good. Probably worth landing that first and rebasing this one.

MaskRay added a commit that referenced this pull request Jun 6, 2024
https://reviews.llvm.org/D85867 changed the way we assign file offsets
(alloc sections first, then non-alloc sections).

It also removed a non-alloc special case from `findOrphanPos`.
Looking at the memory-nonalloc-no-warn.test change, which would be
needed by #93761, it makes sense to restore the previous behavior: when
placing non-alloc orphan sections, keep these sections at the end so
that the section index order matches the file offset order.

This change is cosmetic. In sections-nonalloc.s, GNU ld places the
orphan `other3` in the middle and the orphan .symtab/.shstrtab/.strtab
at the end.

Pull Request: #94519
MaskRay added 2 commits June 6, 2024 13:23
Created using spr 1.3.5-bogner

[skip ci]
Created using spr 1.3.5-bogner
@MaskRay MaskRay changed the base branch from users/MaskRay/spr/main.elf-orphan-placement-remove-hasinputsections-condition to main June 6, 2024 20:24
Created using spr 1.3.5-bogner
@MaskRay
Copy link
Member Author

MaskRay commented Jun 7, 2024

Rebased.

Since output sections with hasInputSections==false always have A or WA flags.
The "similar section's rank <= orphan's rank" condition almost always kicks in, with a few exceptions: ambiguous cases that we don't particularly care about. These test updates are documented in the description.

@igorkudrin
Copy link
Collaborator

Since output sections with hasInputSections==false always have A or WA flags. The "similar section's rank <= orphan's rank" condition almost always kicks in, with a few exceptions: ambiguous cases that we don't particularly care about. These test updates are documented in the description.

Maybe I'm missing something, but the latest revision of this patch with my example still places synthetic sections in an executable segment. Prioritizing the last section of the same similarity (#94099) does little to address the core problem I was trying to demonstrate: output sections without input sections have no reliable flags, and thus when used as potential anchors for orphan sections, the output can be unexpected; another example can easily be conjured up where such sections occur in the last segment, attracting orphans there. So for me, this patch favors simplification over correctness, and I would stick with the latter, especially since the proposed simplification does not seem critical.

@MaskRay
Copy link
Member Author

MaskRay commented Jun 9, 2024

Since output sections with hasInputSections==false always have A or WA flags. The "similar section's rank <= orphan's rank" condition almost always kicks in, with a few exceptions: ambiguous cases that we don't particularly care about. These test updates are documented in the description.

Maybe I'm missing something, but the latest revision of this patch with my example still places synthetic sections in an executable segment. Prioritizing the last section of the same similarity (#94099) does little to address the core problem I was trying to demonstrate: output sections without input sections have no reliable flags, and thus when used as potential anchors for orphan sections, the output can be unexpected; another example can easily be conjured up where such sections occur in the last segment, attracting orphans there. So for me, this patch favors simplification over correctness, and I would stick with the latter, especially since the proposed simplification does not seem critical.

I think the new behavior is expected. Use the example at #93761 (comment).
Given

PHDRS { text PT_LOAD; rodata PT_LOAD; }
SECTIONS {
  .text1 : { *(.text1) } : text
  #.other : { foo = .; *(.other) }
  .other : { LONG(0); *(.other) }
  .text2 : { *(.text2) }
  .rodata : { *(.rodata) } : rodata
}

the section layout after this patch is:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
  [ 3] .dynsym           DYNSYM          0000000000000008 001008 000018 18   A  6   1  8
  [ 4] .gnu.hash         GNU_HASH        0000000000000020 001020 00001c 00   A  3   0  8
  [ 5] .hash             HASH            000000000000003c 00103c 000010 04   A  3   0  4
  [ 6] .dynstr           STRTAB          000000000000004c 00104c 000001 00   A  0   0  1
  [ 7] .text2            PROGBITS        000000000000004d 00104d 000001 00  AX  0   0  1
  [ 8] .text             PROGBITS        0000000000000050 001050 000000 00  AX  0   0  4
  [ 9] .rodata           PROGBITS        0000000000000050 001050 000004 00   A  0   0  1
**** after .rodata
  [10] .orphan           PROGBITS        0000000000000054 001054 000004 00   A  0   0  1
  [11] .dynamic          DYNAMIC         0000000000000058 001058 000070 10  WA  6   0  8
  [12] .comment          PROGBITS        0000000000000000 0010c8 000013 01  MS  0   0  1
  [13] .symtab           SYMTAB          0000000000000000 0010e0 000030 18     15   2  8
  [14] .shstrtab         STRTAB          0000000000000000 001110 000078 00      0   0  1
  [15] .strtab           STRTAB          0000000000000000 001188 00000a 00      0   0  1

If .rodata (A) is removed,

  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
***** after .other
  [ 3] .rodata           PROGBITS        0000000000000005 001005 000004 00   A  0   0  1
  [ 4] .orphan           PROGBITS        0000000000000009 001009 000004 00   A  0   0  1
  [ 5] .dynsym           DYNSYM          0000000000000010 001010 000018 18   A  8   1  8
  [ 6] .gnu.hash         GNU_HASH        0000000000000028 001028 00001c 00   A  5   0  8
  [ 7] .hash             HASH            0000000000000044 001044 000010 04   A  5   0  4
  [ 8] .dynstr           STRTAB          0000000000000054 001054 000001 00   A  0   0  1
  [ 9] .text2            PROGBITS        0000000000000055 001055 000001 00  AX  0   0  1
  [10] .text             PROGBITS        0000000000000058 001058 000000 00  AX  0   0  4
  [11] .dynamic          DYNAMIC         0000000000000058 001058 000070 10  WA  8   0  8
  [12] .comment          PROGBITS        0000000000000000 0010c8 000013 01  MS  0   0  1
  [13] .symtab           SYMTAB          0000000000000000 0010e0 000030 18     15   2  8
  [14] .shstrtab         STRTAB          0000000000000000 001110 000078 00      0   0  1
  [15] .strtab           STRTAB          0000000000000000 001188 00000a 00      0   0  1

@igorkudrin
Copy link
Collaborator

I think the new behavior is expected. Use the example at #93761 (comment). Given

No other linkers put .dynsym and friends into an executable segment. Why do you find this acceptable?

Moreover, if we change the linker script, following the idea of lld to put read-only sections before executable ones:

PHDRS { rodata PT_LOAD; text PT_LOAD;}
SECTIONS {
  .rodata : { *(.rodata) } : rodata
  .text1 : { *(.text1) } : text
  .other : { LONG(0); *(.other) }
  .text2 : { *(.text2) }
}

the result will be

  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .rodata           PROGBITS        0000000000000000 001000 000004 00   A  0   0  1
  [ 2] .dynsym           DYNSYM          0000000000000008 001008 000018 18   A  5   1  8
  [ 3] .gnu.hash         GNU_HASH        0000000000000020 001020 00001c 00   A  2   0  8
  [ 4] .hash             HASH            000000000000003c 00103c 000010 04   A  2   0  4
  [ 5] .dynstr           STRTAB          000000000000004c 00104c 000001 00   A  0   0  1
  [ 6] .text1            PROGBITS        000000000000004d 00104d 000001 00  AX  0   0  1
  [ 7] .other            PROGBITS        000000000000004e 00104e 000004 00   A  0   0  1
  [ 8] .orphan           PROGBITS        0000000000000052 001052 000004 00   A  0   0  1
  [ 9] .text2            PROGBITS        0000000000000056 001056 000001 00  AX  0   0  1
  [10] .text             PROGBITS        0000000000000058 001058 000000 00  AX  0   0  4
  [11] .dynamic          DYNAMIC         0000000000000058 001058 000070 10  WA  5   0  8
  [12] .comment          PROGBITS        0000000000000000 0010c8 000013 01  MS  0   0  1
  [13] .symtab           SYMTAB          0000000000000000 0010e0 000030 18     15   2  8
  [14] .shstrtab         STRTAB          0000000000000000 001110 000078 00      0   0  1
  [15] .strtab           STRTAB          0000000000000000 001188 00000a 00      0   0  1

As you can see, the problem here is exactly the same.

@MaskRay
Copy link
Member Author

MaskRay commented Jun 10, 2024

I think the new behavior is expected. Use the example at #93761 (comment). Given

No other linkers put .dynsym and friends into an executable segment. Why do you find this acceptable?

PHDRS { text PT_LOAD; rodata PT_LOAD; }
SECTIONS {
  .text1 : { *(.text1) } : text
  .other : { LONG(0); *(.other) }
  .text2 : { *(.text2) }
  .rodata1 : { *(.rodata) } : rodata
}

(I use .rodata1 instead of .rodata to get away with GNU ld's special behavior regarding certain section names.)

GNU ld's rank system does not differentiate .dynsym and PROGBITS read-only sections.
Whether .rodata1 is present determines whether .dynsym will be placed after .text2 or .rodata1.
It doesn't have a condition resembling our hasInputSection.

lld has a more fine-grained rank system and prefers to place .dynsym after the first read-only PROGBITS section, when PHDRS/MEMORY is specified.
The script is underspecified. When .rodata1 is present, .dynsym is previously placed after .rodata1.
While some might prefer this behavior, I don't think the script provides enough signal, and this is not justification to retain the hasInputSection condition.

% ld.bfd -shared test.o -T test.t -o test.so && readelf -WSl test.so
...
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
  [ 3] .text2            PROGBITS        0000000000000005 001005 000001 00  AX  0   0  1
  [ 4] .rodata1          PROGBITS        0000000000000006 001006 000004 00   A  0   0  1
## GNU ld places .orphan and .dynsym after .rodata1
  [ 5] .orphan           PROGBITS        000000000000000a 00100a 000004 00   A  0   0  1
  [ 6] .dynsym           DYNSYM          0000000000000010 001010 000018 18   A  7   1  8
  [ 7] .dynstr           STRTAB          0000000000000028 001028 000001 00   A  0   0  1
  [ 8] .hash             HASH            0000000000000030 001030 000010 04   A  6   0  8
  [ 9] .gnu.hash         GNU_HASH        0000000000000040 001040 00001c 00   A  6   0  8
  [10] .dynamic          DYNAMIC         0000000000000060 001060 0000c0 10  WA  7   0  8
...
% oldld.lld -shared test.o -T test.t -o test.so && readelf -WSl test.so
...
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
## .orphan and .dynsym are not placed here because .other has no input section. If .other has a read-only input section, .orphan and .dynsym would be placed here.
  [ 3] .text2            PROGBITS        0000000000000005 001005 000001 00  AX  0   0  1
  [ 4] .text             PROGBITS        0000000000000008 001008 000000 00  AX  0   0  4
  [ 5] .rodata1          PROGBITS        0000000000000008 001008 000004 00   A  0   0  1
  [ 6] .orphan           PROGBITS        000000000000000c 00100c 000004 00   A  0   0  1
  [ 7] .dynsym           DYNSYM          0000000000000010 001010 000018 18   A 10   1  8
  [ 8] .gnu.hash         GNU_HASH        0000000000000028 001028 00001c 00   A  7   0  8
  [ 9] .hash             HASH            0000000000000044 001044 000010 04   A  7   0  4
  [10] .dynstr           STRTAB          0000000000000054 001054 000001 00   A  0   0  1
  [11] .dynamic          DYNAMIC         0000000000000058 001058 000070 10  WA 10   0  8
...
% newld.lld -shared test.o -T test.t -o test.so && readelf -WSl test.so
...
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
## .dynsym (whose sortRank is smaller than .other's) is placed here, regardless of whether or not .other has a read-only input section.
  [ 3] .dynsym           DYNSYM          0000000000000008 001008 000018 18   A  6   1  8
  [ 4] .gnu.hash         GNU_HASH        0000000000000020 001020 00001c 00   A  3   0  8
  [ 5] .hash             HASH            000000000000003c 00103c 000010 04   A  3   0  4
  [ 6] .dynstr           STRTAB          000000000000004c 00104c 000001 00   A  0   0  1
  [ 7] .text2            PROGBITS        000000000000004d 00104d 000001 00  AX  0   0  1
  [ 8] .text             PROGBITS        0000000000000050 001050 000000 00  AX  0   0  4
  [ 9] .rodata1          PROGBITS        0000000000000050 001050 000004 00   A  0   0  1
  [10] .orphan           PROGBITS        0000000000000054 001054 000004 00   A  0   0  1
  [11] .dynamic          DYNAMIC         0000000000000058 001058 000070 10  WA  6   0  8
...

Moreover, if we change the linker script, following the idea of lld to put read-only sections before executable ones:

[...]
As you can see, the problem here is exactly the same.

What's the problem? .other has a rank of read-only sections. .orphan has the same rank and is placed after .other with the simplified condition.

@igorkudrin
Copy link
Collaborator

What's the problem? .other has a rank of read-only sections. .orphan has the same rank and is placed after .other with the simplified condition.

Oh, I see, I was too concise. I meant that input sections .other if they exist, would also be executable. But, under certain conditions, there might be no .other sections in any input files, so the corresponding output section would just be 'A', not 'AX', which would result in the unexpected placement of orphan sections.

Our usage scenario is not a dedicated linker script carefully prepared for a particular application. It is rather a platform/SDK that provides a set of predefined linker scripts, and all developers should use one of them for their applications; it is generally not possible for a developer to tune the linker scripts they use. Also, it doesn't seem feasible to specify all possible input sections in the linker scripts, so we rely on a predictable placement of orphan sections. Thus, I would prefer not to fix what is not broken, unless there is a clear benefit to doing so.

@MaskRay
Copy link
Member Author

MaskRay commented Jun 10, 2024

What's the problem? .other has a rank of read-only sections. .orphan has the same rank and is placed after .other with the simplified condition.

Oh, I see, I was too concise. I meant that input sections .other if they exist, would also be executable. But, under certain conditions, there might be no .other sections in any input files, so the corresponding output section would just be 'A', not 'AX', which would result in the unexpected placement of orphan sections.

Our usage scenario is not a dedicated linker script carefully prepared for a particular application. It is rather a platform/SDK that provides a set of predefined linker scripts, and all developers should use one of them for their applications; it is generally not possible for a developer to tune the linker scripts they use. Also, it doesn't seem feasible to specify all possible input sections in the linker scripts, so we rely on a predictable placement of orphan sections. Thus, I would prefer not to fix what is not broken, unless there is a clear benefit to doing so.

OK, your argument is that .other (hasInputSection==false) might be A or AX and you would prefer that an orphan is not placed after it.
My argument is that as long as .other (hasInputSection==false) is A, whether or not it has an input section should not change the orphan placement behavior.

I think both arguments are reasonable and no one is significantly better than the other.
I believe we should follow Occam's razor and make the simplification... It's useful to make the rules easier to explain.

.dynsym has a different rank than PROGBITS read-only sections and deserves an explicit output section to make a portable script.

@igorkudrin
Copy link
Collaborator

OK, your argument is that .other (hasInputSection==false) might be A or AX and you would prefer that an orphan is not placed after it.

Not exactly. The output section '.other' could be AX (hasInputSection==true) or A (hasInputSection==false), which could trigger different placement of orphans, which in turn could surprise a developer. I wouldn't mind if its flags were the same regardless of whether it has input sections.

I prefer to follow the principle of least astonishment here.

I think both arguments are reasonable and no one is significantly better than the other. I believe we should follow Occam's razor and make the simplification... It's useful to make the rules easier to explain.

I agree. Which explanations for these two cases do you think of?

@MaskRay
Copy link
Member Author

MaskRay commented Jun 11, 2024

OK, your argument is that .other (hasInputSection==false) might be A or AX and you would prefer that an orphan is not placed after it.

Not exactly. The output section '.other' could be AX (hasInputSection==true) or A (hasInputSection==false), which could trigger different placement of orphans, which in turn could surprise a developer. I wouldn't mind if its flags were the same regardless of whether it has input sections.

I prefer to follow the principle of least astonishment here.

readelf -S output shows A and AX differences, which justifies different orphan section placement.

hasInputSection making behaviors different actually surprised me.
From the output, a hasInputSection==false output section using . += xxx is not really distinguishable from an output section that contains an input section. It is odd to me that the orphan section placement is different (without this patch) when the section flags are not distinguishable.

@igorkudrin
Copy link
Collaborator

readelf -S output shows A and AX differences, which justifies different orphan section placement.

Is it possible to assign more suitable attributes to output sections without input sections? Maybe take them from the closest output sections of the same segment?

@smithp35
Copy link
Collaborator

I have sympathy with MaskRay's position in that it is a somewhat arbitrary rule that OutputSections can't affect orphan placement, and may result in surprise given the existing GNU ld documentation (https://sourceware.org/binutils/docs/ld/Orphan-Sections.html).

Having said that I think we may want to define what type and flags our OutputSections without InputSections get their type to given people more stability.

My understanding of GNU ld is that it will by default give the OutputSection SHT_PROGBITS type and "RAW" flags. This can be altered to READONLY with recent releses, although I don't think LLD supports that.

READONLY ( TYPE = type )
This form of the syntax combines the READONLY type with the type specified by type.

Empiricially LLD seems to inherit the type and flags from a previous OutputSection (I've not checked the code). This largely seems to do the right thing although I'm not convinced that inheriting the "x" flag is the right thing to do.

My, little thought, opening bid would be something like:

However that is separate from this pull request. To progress this one do we have an example of one of these sample linker scripts having a problematic Output Section acting as an anchor for orphans, or is it a theoretical risk?

@igorkudrin
Copy link
Collaborator

To progress this one do we have an example of one of these sample linker scripts having a problematic Output Section acting as an anchor for orphans, or is it a theoretical risk?

So far, the risk is theoretical. I just don't like it when data can be written to an executable segment unexpectedly. Since LinkerScript::adjustOutputSections() propagates SHF_WRITE but not SHF_EXECINSTR, the problematic case could be a linker script that defines read-only sections before executable sections, and among the executable sections there is an empty output section with symbol assignments, data values, or other similar instructions; in this scenario, read-only orphan sections are placed in the executable segment. I can't say how likely this case is, nor how harmful or harmless it might be.

@MaskRay
Copy link
Member Author

MaskRay commented Jun 12, 2024

I have sympathy with MaskRay's position in that it is a somewhat arbitrary rule that OutputSections can't affect orphan placement, and may result in surprise given the existing GNU ld documentation (https://sourceware.org/binutils/docs/ld/Orphan-Sections.html).

Thanks!

Having said that I think we may want to define what type and flags our OutputSections without InputSections get their type to given people more stability.

My understanding of GNU ld is that it will by default give the OutputSection SHT_PROGBITS type and "RAW" flags. This can be altered to READONLY with recent releses, although I don't think LLD supports that.

READONLY ( TYPE = type )
This form of the syntax combines the READONLY type with the type specified by type.

GNU ld since Sep 2020 defaults to "RA" flags. Older versions used "RAW".
. += expr adds "W". BYTE/QUAD don't.
https://sourceware.org/bugzilla/show_bug.cgi?id=26378

Orphan sections use the default type SHT_PROGBITS.

GNU ld introduced "READONLY" in July 2021 (https://sourceware.org/pipermail/binutils/2021-July/117492.html)
"READONLY" gives an explicit flags signal, which can be a nice thing, though we could probably have different syntax (e.g. (FLAGS=aw)) resembling (TYPE=xxx).

Empiricially LLD seems to inherit the type and flags from a previous OutputSection (I've not checked the code). This largely seems to do the right thing although I'm not convinced that inheriting the "x" flag is the right thing to do.

My, little thought, opening bid would be something like:

We inherit SHF_ALLOC|SHF_WRITE flags around LinkerScript.cpp:1253. We used to inherit SHF_EXECINSTR, but I removed it in #70911.
The section type is not inherited.

I have also considered that flags setting syntax can help guide orphan placement.
While a more complex condition like osd && (osd->osec.hasInputSections || osd->osec.hasExplicitFlags) could be used,
opting for the simpler osd condition seems easier to understand.

However that is separate from this pull request. To progress this one do we have an example of one of these sample linker scripts having a problematic Output Section acting as an anchor for orphans, or is it a theoretical risk?

So far, the risk is theoretical. I just don't like it when data can be written to an executable segment unexpectedly. Since LinkerScript::adjustOutputSections() propagates SHF_WRITE but not SHF_EXECINSTR, the problematic case could be a linker script that defines read-only sections before executable sections, and among the executable sections there is an empty output section with symbol assignments, data values, or other similar instructions; in this scenario, read-only orphan sections are placed in the executable segment. I can't say how likely this case is, nor how harmful or harmless it might be.

This (.rodata; .text sec_without_input .text1) is valid concern, though this might be rare and I believe it doesn't justify a particular orphan placement behavior.
I think this is rare because: . += expr leads to a writable section in GNU ld, which makes . padding not useful for code padding (#70911).

Considering this and our shift towards explicit flags syntax, removing hasInputSections seems viable, simplifying the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants