-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[ELF] adjustOutputSections: don't copy SHF_EXECINSTR when an output does not contain input sections #70911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lld-elf @llvm/pr-subscribers-lld Author: Fangrui Song (MaskRay) ChangesFor an output section with no input section, GNU ld eliminates the We choose to retain such an output section with a symbol assignment
However, this I think we should drop SHF_EXECINSTR when copying For data-command-only output sections (e.g. Some tests are updated to drop the SHF_EXECINSTR flag. GNU ld doesn't Full diff: https://github.com/llvm/llvm-project/pull/70911.diff 6 Files Affected:
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index df091613dc0a144..198c63bf0a8a24b 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -1193,8 +1193,8 @@ void LinkerScript::adjustOutputSections() {
// We do not want to keep any special flags for output section
// in case it is empty.
if (isEmpty)
- sec->flags = flags & ((sec->nonAlloc ? 0 : (uint64_t)SHF_ALLOC) |
- SHF_WRITE | SHF_EXECINSTR);
+ sec->flags =
+ flags & ((sec->nonAlloc ? 0 : (uint64_t)SHF_ALLOC) | SHF_WRITE);
// The code below may remove empty output sections. We should save the
// specified program headers (if exist) and propagate them to subsequent
diff --git a/lld/docs/ELF/linker_script.rst b/lld/docs/ELF/linker_script.rst
index fbd96abcb44c5b1..3606ef4fe4b8ee1 100644
--- a/lld/docs/ELF/linker_script.rst
+++ b/lld/docs/ELF/linker_script.rst
@@ -97,6 +97,16 @@ The presence of ``address`` can cause the condition unsatisfied. LLD will warn.
GNU ld from Binutils 2.35 onwards will reduce sh_addralign so that
sh_addr=0 (modulo sh_addralign).
+When an output section has no input section, GNU ld will eliminate it if it
+only contains symbol assignments (e.g. ``.foo { symbol = 42; }``). LLD will
+retain such sections unless all the symbol assignments are unreferenced
+``PROVIDED``.
+
+When an output section has no input section but advances the location counter,
+GNU ld sets the ``SHF_WRITE`` flag. LLD sets the SHF_WRITE flag only if the
+preceding output section with non-empty input sections also has the SHF_WRITE
+flag.
+
Output section type
-------------------
diff --git a/lld/test/ELF/linkerscript/empty-synthetic-removed-flags.s b/lld/test/ELF/linkerscript/empty-synthetic-removed-flags.s
index b8d72702425f19f..dfc502e8da9e1d4 100644
--- a/lld/test/ELF/linkerscript/empty-synthetic-removed-flags.s
+++ b/lld/test/ELF/linkerscript/empty-synthetic-removed-flags.s
@@ -29,7 +29,6 @@
# EMPTY-NEXT: Type: SHT_PROGBITS
# EMPTY-NEXT: Flags [
# EMPTY-NEXT: SHF_ALLOC
-# EMPTY-NEXT: SHF_EXECINSTR
# EMPTY-NEXT: ]
.section .foo,"ax"
diff --git a/lld/test/ELF/linkerscript/extend-pt-load2.test b/lld/test/ELF/linkerscript/extend-pt-load2.test
index 3e91f3b4f162d07..81f5379681e5775 100644
--- a/lld/test/ELF/linkerscript/extend-pt-load2.test
+++ b/lld/test/ELF/linkerscript/extend-pt-load2.test
@@ -16,10 +16,11 @@ SECTIONS {
.data.rel.ro : { *(.data.rel.ro) }
}
-# CHECK: .text PROGBITS 00000000000001f4 0001f4 000001 00 AX
-# CHECK-NEXT: bar PROGBITS 00000000000001f5 0001f5 000e0b 00 AX
-# CHECK-NEXT: .data.rel.ro PROGBITS 0000000000001000 001000 000001 00 WA
+# CHECK: .text PROGBITS 000000000000022c 00022c 000001 00 AX 0
+# CHECK-NEXT: bar PROGBITS 000000000000022d 00022d 000dd3 00 A 0
+# CHECK-NEXT: .data.rel.ro PROGBITS 0000000000001000 001000 000001 00 WA 0
-# CHECK: LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x0001f1 0x0001f1 R 0x1000
-# CHECK: LOAD 0x0001f4 0x00000000000001f4 0x00000000000001f4 0x000e0c 0x000e0c R E 0x1000
+# CHECK: LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000229 0x000229 R 0x1000
+# CHECK-NEXT: LOAD 0x00022c 0x000000000000022c 0x000000000000022c 0x000001 0x000001 R E 0x1000
+# CHECK-NEXT: LOAD 0x00022d 0x000000000000022d 0x000000000000022d 0x000dd3 0x000dd3 R 0x1000
# CHECK-NEXT: LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x000068 0x000068 RW 0x1000
diff --git a/lld/test/ELF/linkerscript/insert-before.test b/lld/test/ELF/linkerscript/insert-before.test
index f9611538c013b2b..e6ed413639827a9 100644
--- a/lld/test/ELF/linkerscript/insert-before.test
+++ b/lld/test/ELF/linkerscript/insert-before.test
@@ -9,13 +9,14 @@
# RUN: llvm-readelf -S -l %t1 | FileCheck %s
# CHECK: Name Type Address Off Size ES Flg
# CHECK-NEXT: NULL
-# CHECK-NEXT: .foo.text PROGBITS 0000000000000000 001000 000008 00 AX
-# CHECK-NEXT: .text PROGBITS 0000000000000008 001008 000008 00 AX
-# CHECK-NEXT: .byte PROGBITS 0000000000000010 001010 000001 00 AX
-# CHECK-NEXT: .foo.data PROGBITS 0000000000000011 001011 000008 00 WA
-# CHECK-NEXT: .data PROGBITS 0000000000000019 001019 000008 00 WA
+# CHECK-NEXT: .foo.text PROGBITS 0000000000000000 001000 000008 00 AX 0
+# CHECK-NEXT: .text PROGBITS 0000000000000008 001008 000008 00 AX 0
+# CHECK-NEXT: .byte PROGBITS 0000000000000010 001010 000001 00 A 0
+# CHECK-NEXT: .foo.data PROGBITS 0000000000000011 001011 000008 00 WA 0
+# CHECK-NEXT: .data PROGBITS 0000000000000019 001019 000008 00 WA 0
# CHECK: Type
# CHECK-NEXT: LOAD {{.*}} R E
+# CHECK-NEXT: LOAD {{.*}} R
# CHECK-NEXT: LOAD {{.*}} RW
# CHECK-NEXT: GNU_STACK {{.*}} RW
diff --git a/lld/test/ELF/linkerscript/lma-align2.test b/lld/test/ELF/linkerscript/lma-align2.test
index 2177101ff37475d..23b06b179b6e85d 100644
--- a/lld/test/ELF/linkerscript/lma-align2.test
+++ b/lld/test/ELF/linkerscript/lma-align2.test
@@ -7,17 +7,17 @@
# CHECK: Name Type Address Off Size ES Flg Lk Inf Al
# CHECK-NEXT: NULL 0000000000000000 000000 000000 00 0 0 0
# CHECK-NEXT: .text PROGBITS 0000000008000000 001000 000001 00 AX 0 0 4
-# CHECK-NEXT: .data PROGBITS 0000000020000000 002000 000006 00 AX 0 0 8
-# CHECK-NEXT: .data2 PROGBITS 000000000800000e 00200e 000008 00 AX 0 0 1
-# CHECK-NEXT: .data3 PROGBITS 0000000020000008 002018 000000 00 AX 0 0 8
-# CHECK-NEXT: .data4 PROGBITS 0000000008000018 002018 000008 00 AX 0 0 1
+# CHECK-NEXT: .data PROGBITS 0000000020000000 002000 000006 00 A 0 0 8
+# CHECK-NEXT: .data2 PROGBITS 000000000800000e 00200e 000008 00 A 0 0 1
+# CHECK-NEXT: .data3 PROGBITS 0000000020000008 002018 000000 00 A 0 0 8
+# CHECK-NEXT: .data4 PROGBITS 0000000008000018 002018 000008 00 A 0 0 1
# CHECK: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
# CHECK-NEXT: LOAD 0x001000 0x0000000008000000 0x0000000008000000 0x000001 0x000001 R E 0x1000
-# CHECK-NEXT: LOAD 0x002000 0x0000000020000000 0x0000000008000008 0x000006 0x000006 R E 0x1000
-# CHECK-NEXT: LOAD 0x00200e 0x000000000800000e 0x000000000800000e 0x000008 0x000008 R E 0x1000
-# CHECK-NEXT: LOAD 0x002018 0x0000000008000018 0x0000000008000018 0x000008 0x000008 R E 0x1000
+# CHECK-NEXT: LOAD 0x002000 0x0000000020000000 0x0000000008000008 0x000006 0x000006 R 0x1000
+# CHECK-NEXT: LOAD 0x00200e 0x000000000800000e 0x000000000800000e 0x000008 0x000008 R 0x1000
+# CHECK-NEXT: LOAD 0x002018 0x0000000008000018 0x0000000008000018 0x000008 0x000008 R 0x1000
MEMORY {
CODE (rx) : ORIGIN = 0x08000000, LENGTH = 100K
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is an improvement and the code-changes look good to achieve what is in the specification.
I don't think that this change would fully address the change in https://discourse.llvm.org/t/lld-output-section-flag-assignment-behavior/74359 as that had:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .code PROGBITS 0000000010000000 001000 000003 00 AX 0 0 1
[ 2] .CPU0.heap NOBITS 0000000010000040 001003 000400 00 AX 0 0 64
[ 3] .CPU1.heap NOBITS 0000000010000440 001003 000400 00 AX 0 0 64
[ 4] .CPU2.heap NOBITS 0000000010000840 001003 000400 00 AX 0 0 64
With this change I think we would have flags of A instead of AX. Leaving a non writeable BSS section which looks a bit odd.
If we start from the assumption that OutputSections with just symbol assignments are most likely to be data then dropping SHF_EXECINSTR makes sense. In the general case (SHT_PROGBITS) we can't tell if the user intended them to be read-only or read-write so we can't justify adding SHF_WRITE. However if the user has used NOLOAD or TYPE=SHT_NOBITS then I think there is a case for adding SHF_WRITE.
I don't have a particularly strong opinion here so happy to go with your original patch if you prefer.
@@ -1193,8 +1193,8 @@ void LinkerScript::adjustOutputSections() { | |||
// We do not want to keep any special flags for output section | |||
// in case it is empty. | |||
if (isEmpty) | |||
sec->flags = flags & ((sec->nonAlloc ? 0 : (uint64_t)SHF_ALLOC) | | |||
SHF_WRITE | SHF_EXECINSTR); | |||
sec->flags = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be worth updating the comment on line 1170
// The other option is to pick flags that minimize the impact the section
// will have on the rest of the linker. That is why we copy the flags from
// the previous sections. Only a few flags are needed to keep the impact low.
Perhaps
"// We do not propagate SHF_EXECINSTR as in some cases this can lead to executable writeable SHT_NOBITS section."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. Reworded a bit
Hi, I think this PR improves LLD by giving consistent behavior by removing the X flag. But I think that @smithp35 pointed out an interesting improvement (maybe for the future) to include W in specific situations. Semantically, some specific NOLOAD/NOBITS can also have W because they don't provide useful data to be read-only. |
This change makes our behavior similar to GNU ld regarding X. Regarding W, We could make a special case to say every |
2964325
to
25bdac6
Compare
…oes not contain input sections For an output section with no input section, GNU ld eliminates the output section when there are only symbol assignments (e.g. `.foo : { symbol = 42; }`) but not for `.foo : { . += 42; }` (`SHF_ALLOC|SHF_WRITE`). We choose to retain such an output section with a symbol assignment (unless unreferenced `PROVIDE`). We copy the previous section flag (see https://reviews.llvm.org/D37736) to hopefully make the current PT_LOAD segment extend to the current output section: * decrease the number of PT_LOAD segments * If a new PT_LOAD segment is introduced without a page-size alignment as a separator, there may be a run-time crash. However, this `flags` copying behavior is not suitable for `.foo : { . += 42; }` when `flags` contains `SHF_EXECINSTR`. The executable bit is surprising (https://discourse.llvm.org/t/lld-output-section-flag-assignment-behavior/74359). I think we should drop SHF_EXECINSTR when copying `flags`. The risk is a code section followed by `.foo : { symbol = 42; }` will be broken, which I believe is unrelated as such uses are almost always related to data sections. For data-command-only output sections (e.g. `.foo : { QUAD(42) }`), we keep allowing copyable SHF_WRITE. Some tests are updated to drop the SHF_EXECINSTR flag. GNU ld doesn't set SHF_EXECINSTR as well, though it sets SHF_WRITE for some tests while we don't.
25bdac6
to
25bc205
Compare
For an output section with no input section, GNU ld eliminates the
output section when there are only symbol assignments (e.g.
.foo : { symbol = 42; }
) but not for.foo : { . += 42; }
(
SHF_ALLOC|SHF_WRITE
).We choose to retain such an output section with a symbol assignment
(unless unreferenced
PROVIDE
). We copy the previous section flag (seehttps://reviews.llvm.org/D37736) to hopefully make the current PT_LOAD
segment extend to the current output section:
alignment as a separator, there may be a run-time crash.
However, this
flags
copying behavior is not suitable for.foo : { . += 42; }
whenflags
containsSHF_EXECINSTR
. Theexecutable bit is surprising
(https://discourse.llvm.org/t/lld-output-section-flag-assignment-behavior/74359).
I think we should drop SHF_EXECINSTR when copying
flags
. The risk is acode section followed by
.foo : { symbol = 42; }
will be broken, whichI believe is unrelated as such uses are almost always related to data
sections.
For data-command-only output sections (e.g.
.foo : { QUAD(42) }
), wekeep allowing copyable SHF_WRITE.
Some tests are updated to drop the SHF_EXECINSTR flag. GNU ld doesn't
set SHF_EXECINSTR as well, though it sets SHF_WRITE for some tests while
we don't.