[LLD][AArch64] Mark .plt with PURECODE flag if all input sections also have it #132224

Il-Capitano · 2025-03-20T14:45:33Z

Mark the synthetic .plt section with the SHF_AARCH64_PURECODE section flag if all executable input sections also have that flag.

Without this change, if a binary is compiled with -mexecute-only, .plt is the only executable section in the final executable that is not marked with the section flag, so it is placed in a different load segment. This unnecessarily costs an extra page of memory when running the executable.

A similar issue occurs if we always set the section flag on .plt and compile a binary without -mexecute-only, so the solution is to make the SHF_AARCH64_PURECODE flag on .plt match the flag on all other executable sections.
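
To illustrate where the extra page comes from, here is a minimal sketch under a simplified assumption: every load segment is mapped starting on its own page. The page size, the section sizes, and the function names are hypothetical and are not taken from lld.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified model (an assumption, not lld's layout algorithm): every load
// segment starts on a fresh page, so it costs ceil(size / pageSize) pages.
constexpr uint64_t kPageSize = 4096; // assumed run-time page size

constexpr uint64_t pagesFor(uint64_t segmentSize) {
  return (segmentSize + kPageSize - 1) / kPageSize;
}

uint64_t totalPages(const std::vector<uint64_t> &segmentSizes) {
  uint64_t pages = 0;
  for (uint64_t size : segmentSizes)
    pages += pagesFor(size);
  return pages;
}

// Hypothetical sizes: 24 bytes of .text and 64 bytes of .plt.
void checkPageCost() {
  assert(totalPages({24 + 64}) == 1); // .text and .plt share one segment
  assert(totalPages({24, 64}) == 2);  // .plt split into its own segment
}
```

In this model the split costs one extra page no matter how small .plt is, which is the effect the description refers to.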

llvmbot · 2025-03-20T14:46:07Z

@llvm/pr-subscribers-lld-elf

Author: Csanád Hajdú (Il-Capitano)

Full diff: https://github.com/llvm/llvm-project/pull/132224.diff

2 Files Affected:

(modified) lld/ELF/SyntheticSections.cpp (+12)
(added) lld/test/ELF/aarch64-execute-only-plt.s (+115)

diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index b03c4282ab1aa..a7ff8ed9b16d1 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -2610,6 +2610,18 @@ PltSection::PltSection(Ctx &ctx)
   // modify the instructions in the PLT entries.
   if (ctx.arg.emachine == EM_SPARCV9)
     this->flags |= SHF_WRITE;
+
+  // On AArch64, PLT entries only do loads from the .got.plt section, so the
+  // .plt section can be marked with the SHF_AARCH64_PURECODE section flag. We
+  // only do this if all other executable sections also have the same section
+  // flag set, because otherwise .plt can't be allocated in the same segment as
+  // the other executable sections.
+  if (ctx.arg.emachine == EM_AARCH64 &&
+      all_of(ctx.inputSections, [](InputSectionBase *sec) {
+        return !(sec->flags & SHF_EXECINSTR) ||
+               (sec->flags & SHF_AARCH64_PURECODE);
+      }))
+    this->flags |= SHF_AARCH64_PURECODE;
 }
 
 void PltSection::writeTo(uint8_t *buf) {
diff --git a/lld/test/ELF/aarch64-execute-only-plt.s b/lld/test/ELF/aarch64-execute-only-plt.s
new file mode 100644
index 0000000000000..08e69fba8fb0c
--- /dev/null
+++ b/lld/test/ELF/aarch64-execute-only-plt.s
@@ -0,0 +1,115 @@
+// REQUIRES: aarch64
+// RUN: rm -rf %t && split-file %s %t && cd %t
+
+// RUN: llvm-mc -filetype=obj -triple=aarch64 start.s -o start.o
+// RUN: llvm-mc -filetype=obj -triple=aarch64 foo-xo-same-section.s -o foo-xo-same-section.o
+// RUN: llvm-mc -filetype=obj -triple=aarch64 foo-rx-same-section.s -o foo-rx-same-section.o
+// RUN: llvm-mc -filetype=obj -triple=aarch64 foo-xo-different-section.s -o foo-xo-different-section.o
+// RUN: llvm-mc -filetype=obj -triple=aarch64 foo-rx-different-section.s -o foo-rx-different-section.o
+// RUN: llvm-mc -filetype=obj -triple=aarch64 %p/Inputs/plt-aarch64.s -o plt.o
+// RUN: ld.lld -shared plt.o -soname=t2.so -o plt.so
+// RUN: ld.lld start.o foo-xo-same-section.o plt.so -o xo-same-section
+// RUN: ld.lld start.o foo-rx-same-section.o plt.so -o rx-same-section
+// RUN: ld.lld start.o foo-xo-different-section.o plt.so -o xo-different-section
+// RUN: ld.lld start.o foo-rx-different-section.o plt.so -o rx-different-section
+// RUN: llvm-readobj -S -l xo-same-section | FileCheck --check-prefix=CHECK-XO %s
+// RUN: llvm-readobj -S -l rx-same-section | FileCheck --check-prefix=CHECK-RX %s
+// RUN: llvm-readobj -S -l xo-different-section | FileCheck --check-prefix=CHECK-XO %s
+// RUN: llvm-readobj -S -l rx-different-section | FileCheck --check-prefix=CHECK-RX %s
+// RUN: llvm-objdump -d --no-show-raw-insn xo-same-section | FileCheck --check-prefix=DISASM %s
+// RUN: llvm-objdump -d --no-show-raw-insn rx-same-section | FileCheck --check-prefix=DISASM %s
+// RUN: llvm-objdump -d --no-show-raw-insn xo-different-section | FileCheck --check-prefix=DISASM %s
+// RUN: llvm-objdump -d --no-show-raw-insn rx-different-section | FileCheck --check-prefix=DISASM %s
+
+// CHECK-XO:         Name: .plt
+// CHECK-XO-NEXT:    Type: SHT_PROGBITS
+// CHECK-XO-NEXT:    Flags [
+// CHECK-XO-NEXT:      SHF_AARCH64_PURECODE
+// CHECK-XO-NEXT:      SHF_ALLOC
+// CHECK-XO-NEXT:      SHF_EXECINSTR
+// CHECK-XO-NEXT:    ]
+// CHECK-XO-NEXT:    Address: 0x2102E0
+
+/// The address of .plt above should be within this program header.
+// CHECK-XO:         VirtualAddress: 0x2102C8
+// CHECK-XO-NEXT:    PhysicalAddress: 0x2102C8
+// CHECK-XO-NEXT:    FileSize: 88
+// CHECK-XO-NEXT:    MemSize: 88
+// CHECK-XO-NEXT:    Flags [
+// CHECK-XO-NEXT:      PF_X
+// CHECK-XO-NEXT:    ]
+
+// CHECK-RX:         Name: .plt
+// CHECK-RX-NEXT:    Type: SHT_PROGBITS
+// CHECK-RX-NEXT:    Flags [
+// CHECK-RX-NEXT:      SHF_ALLOC
+// CHECK-RX-NEXT:      SHF_EXECINSTR
+// CHECK-RX-NEXT:    ]
+// CHECK-RX-NEXT:    Address: 0x2102E0
+
+/// The address of .plt above should be within this program header.
+// CHECK-RX:         VirtualAddress: 0x2102C8
+// CHECK-RX-NEXT:    PhysicalAddress: 0x2102C8
+// CHECK-RX-NEXT:    FileSize: 88
+// CHECK-RX-NEXT:    MemSize: 88
+// CHECK-RX-NEXT:    Flags [
+// CHECK-RX-NEXT:      PF_R
+// CHECK-RX-NEXT:      PF_X
+// CHECK-RX-NEXT:    ]
+
+// DISASM-LABEL: Disassembly of section .plt:
+// DISASM-LABEL: <.plt>:
+// DISASM-NEXT:  2102e0: stp  x16, x30, [sp, #-0x10]!
+// DISASM-NEXT:          adrp x16, 0x230000 <weak+0x230000>
+// DISASM-NEXT:          ldr  x17, [x16, #0x400]
+// DISASM-NEXT:          add  x16, x16, #0x400
+// DISASM-NEXT:          br   x17
+// DISASM-NEXT:          nop
+// DISASM-NEXT:          nop
+// DISASM-NEXT:          nop
+
+// DISASM-LABEL: <bar@plt>:
+// DISASM-NEXT:  210300: adrp x16, 0x230000 <weak+0x230000>
+// DISASM-NEXT:          ldr  x17, [x16, #0x408]
+// DISASM-NEXT:          add  x16, x16, #0x408
+// DISASM-NEXT:          br   x17
+
+// DISASM-LABEL: <weak@plt>:
+// DISASM-NEXT:  210310: adrp x16, 0x230000 <weak+0x230000>
+// DISASM-NEXT:          ldr  x17, [x16, #0x410]
+// DISASM-NEXT:          add  x16, x16, #0x410
+// DISASM-NEXT:          br   x17
+
+//--- start.s
+.section .text,"axy",@progbits,unique,0
+.global _start, foo, bar
+.weak weak
+_start:
+  bl foo
+  bl bar
+  bl weak
+  ret
+
+//--- foo-xo-same-section.s
+.section .text,"axy",@progbits,unique,0
+.global foo
+foo:
+  ret
+
+//--- foo-rx-same-section.s
+.section .text,"ax",@progbits,unique,0
+.global foo
+foo:
+  ret
+
+//--- foo-xo-different-section.s
+.section .foo,"axy",@progbits,unique,0
+.global foo
+foo:
+  ret
+
+//--- foo-rx-different-section.s
+.section .foo,"ax",@progbits,unique,0
+.global foo
+foo:
+  ret
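
The test's remark that "the address of .plt above should be within this program header" can be checked with a little arithmetic. The following is only an illustrative sketch: the .plt size is inferred from the DISASM lines (a 32-byte header followed by two 16-byte entries), and the helper name `contains` is made up for the example.

```cpp
#include <cstdint>

// Values copied from the CHECK lines of the test above.
constexpr uint64_t kSegmentStart = 0x2102C8; // VirtualAddress
constexpr uint64_t kSegmentSize = 88;        // MemSize
constexpr uint64_t kPltStart = 0x2102E0;     // .plt Address

// Inferred from the DISASM lines: an 8-instruction header (32 bytes)
// followed by two 4-instruction entries, bar@plt and weak@plt.
constexpr uint64_t kPltSize = 32 + 2 * 16;

// True if [addr, addr + len) lies entirely inside [start, start + size).
constexpr bool contains(uint64_t start, uint64_t size, uint64_t addr,
                        uint64_t len) {
  return addr >= start && addr + len <= start + size;
}

static_assert(contains(kSegmentStart, kSegmentSize, kPltStart, kPltSize),
              ".plt lies inside the executable segment");
```

Both ranges end at 0x210320, so .plt occupies the tail of the same segment as the code that precedes it.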

Il-Capitano · 2025-03-20T14:51:37Z

I had some concern about the performance impact of looping through all input sections when creating the .plt section, so I did some measurements of building Clang with -mexecute-only -ffunction-sections on an AArch64 machine (I verified that the final binary has the correct flags set).

I couldn't measure any difference in link time between the old and new versions; it was within noise, so I don't think the performance of the loop is a concern. In the common, non-execute-only case, short-circuiting in all_of should also prevent any noticeable time difference.
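
The short-circuiting argument can be illustrated with a small standalone sketch. It assumes the usual standard library behaviour of `std::all_of` stopping at the first element that fails the predicate; `Section` and `visitedByAllOf` are hypothetical names, and the flag constants are believed to match the ELF values but should be treated as assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint64_t kExecInstr = 0x4;       // SHF_EXECINSTR
constexpr uint64_t kPurecode = 0x20000000; // SHF_AARCH64_PURECODE

struct Section { uint64_t flags; }; // hypothetical stand-in for InputSectionBase

// Runs the PR's predicate and reports how many sections it had to visit.
size_t visitedByAllOf(const std::vector<Section> &sections, bool &result) {
  size_t visited = 0;
  result = std::all_of(sections.begin(), sections.end(),
                       [&](const Section &sec) {
                         ++visited;
                         return !(sec.flags & kExecInstr) ||
                                (sec.flags & kPurecode);
                       });
  return visited;
}

void checkShortCircuit() {
  bool allPurecode = false;
  // The first executable section lacks PURECODE, so the scan stops at once.
  std::vector<Section> rx(1000, Section{kExecInstr});
  assert(visitedByAllOf(rx, allPurecode) == 1 && !allPurecode);
  // An execute-only link has to visit every section.
  std::vector<Section> xo(1000, Section{kExecInstr | kPurecode});
  assert(visitedByAllOf(xo, allPurecode) == 1000 && allPurecode);
}
```

So the full scan is only paid in the execute-only case, where the result actually changes the output.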

smithp35

I've made a suggestion that avoids the loop and that I think will work for non-degenerate cases.

smithp35 · 2025-03-20T18:21:57Z

lld/ELF/SyntheticSections.cpp

@@ -2610,6 +2610,18 @@ PltSection::PltSection(Ctx &ctx)
  // modify the instructions in the PLT entries.
  if (ctx.arg.emachine == EM_SPARCV9)
    this->flags |= SHF_WRITE;
+


Alternatively it should be possible to universally set SHF_AARCH64_PURECODE and then handle this in Writer.cpp::createPhdrs()

https://github.com/llvm/llvm-project/blob/main/lld/ELF/Writer.cpp#L2381

uint64_t newFlags = computeFlags(ctx, sec->getPhdrFlags());
// When --no-rosegment is specified, RO and RX sections are compatible.
uint32_t incompatible = flags ^ newFlags;
if (ctx.arg.singleRoRx && !(newFlags & PF_W))
  incompatible &= ~PF_X;

Something like:

if (sec == ctx.in.plt && (flags & PF_R))
  newFlags |= PF_R;

It is true that the .plt could in theory be the first section, but that would normally take a linker script that makes it the first OutputSection. I think that's unlikely, and it could be fixed with PHDRS.

I did think we might do this for all OutputSections but I guess for bare-metal there's still a use case for separate XO and non-XO segments.

Another possibility is to record any non-XO OutputSection that we see in ctx.
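
To make the quoted check concrete, here is a minimal standalone sketch of the flag comparison. It models only the lines quoted from createPhdrs() and ignores every other condition the real function may apply; `startsNewSegment` is a hypothetical name.

```cpp
#include <cstdint>

// ELF program header permission bits (PF_X, PF_W, PF_R).
constexpr uint32_t kPfX = 1, kPfW = 2, kPfR = 4;

// Mirrors the quoted check: any flag mismatch starts a new load segment,
// except that PF_X is ignored for non-writable sections under --no-rosegment.
constexpr bool startsNewSegment(uint32_t flags, uint32_t newFlags,
                                bool singleRoRx) {
  uint32_t incompatible = flags ^ newFlags;
  if (singleRoRx && !(newFlags & kPfW))
    incompatible &= ~kPfX;
  return incompatible != 0;
}

static_assert(!startsNewSegment(kPfR | kPfX, kPfR | kPfX, false),
              "RX after RX stays in the same segment");
static_assert(startsNewSegment(kPfR | kPfX, kPfX, false),
              "an XO .plt after RX .text starts a new segment");
static_assert(startsNewSegment(kPfX, kPfR | kPfX, false),
              "an RX .plt after XO .text starts a new segment");
```

An XO section and an RX section differ only in PF_R, which is why a .plt whose flag does not match the surrounding code ends up in its own segment.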

sec can't be compared with ctx.in.plt there, because it is an output section, and ctx.in.plt is an input section. We'd have to do findSection(ctx, ".plt") in order to get the .plt output section.

Another concern I have with manipulating the output sections directly is that maybe one of the non-synthetic input sections might be placed in the .plt output section? I'm not sure if this really happens with real code, but I'd rather write a solution that works in every case by using SHF_AARCH64_PURECODE correctly on the input sections.

For an alternate solution, another possible spot I found where we can modify the flags of ctx.in.plt is in this loop inside addOrphanSections:

llvm-project/lld/ELF/LinkerScript.cpp

Lines 1015 to 1040 in 77edfbb

// For further --emit-reloc handling code we need target output section
// to be created before we create relocation output section, so we want
// to create target sections first. We do not want priority handling
// for synthetic sections because them are special.
size_t n = 0;
for (InputSectionBase *isec : ctx.inputSections) {
  // Process InputSection and MergeInputSection.
  if (LLVM_LIKELY(isa<InputSection>(isec)))
    ctx.inputSections[n++] = isec;

  // In -r links, SHF_LINK_ORDER sections are added while adding their parent
  // sections because we need to know the parent's output section before we
  // can select an output section for the SHF_LINK_ORDER section.
  if (ctx.arg.relocatable && (isec->flags & SHF_LINK_ORDER))
    continue;

  if (auto *sec = dyn_cast<InputSection>(isec))
    if (InputSectionBase *rel = sec->getRelocatedSection())
      if (auto *relIS = dyn_cast_or_null<InputSectionBase>(rel->parent))
        add(relIS);
  add(isec);
  if (ctx.arg.relocatable)
    for (InputSectionBase *depSec : isec->dependentSections)
      if (depSec->flags & SHF_LINK_ORDER)
        add(depSec);
}

Adding something like this here works:

// Only check for PURECODE flag on AArch64 to decide if .plt should have the
// flag as well or not.
bool isAllPurecode = ctx.arg.emachine == EM_AARCH64;
for (InputSectionBase *isec : ctx.inputSections) {
  isAllPurecode = isAllPurecode && (isa<SyntheticSection>(isec) ||
                                    !(isec->flags & SHF_EXECINSTR) ||
                                    (isec->flags & SHF_AARCH64_PURECODE));
  // ...
}
if (isAllPurecode)
  ctx.in.plt->flags |= SHF_AARCH64_PURECODE;

This saves an extra loop over the input sections in the PltSection constructor, but the logic gets decoupled from PltSection, which I'm not a fan of. What do you think about this?
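
The single-pass idea can be tried outside of lld with a small standalone sketch. `Section` and `pltMayBePurecode` are hypothetical stand-ins for lld's types, and the flag constants are believed to match the ELF values but should be treated as assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint64_t kExecInstr = 0x4;       // SHF_EXECINSTR
constexpr uint64_t kPurecode = 0x20000000; // SHF_AARCH64_PURECODE

// Hypothetical stand-in for lld's InputSectionBase.
struct Section {
  uint64_t flags;
  bool synthetic; // true for linker-created sections such as .plt
};

// Folds the PURECODE check into a loop that already walks every section.
bool pltMayBePurecode(const std::vector<Section> &sections, bool isAArch64) {
  bool allPurecode = isAArch64;
  for (const Section &sec : sections) {
    allPurecode = allPurecode && (sec.synthetic || !(sec.flags & kExecInstr) ||
                                  (sec.flags & kPurecode));
    // ... the loop's existing per-section work would go here ...
  }
  return allPurecode;
}

void checkPltMayBePurecode() {
  std::vector<Section> sections = {
      {kExecInstr | kPurecode, false}, // execute-only .text
      {0, false},                      // data section, ignored
      {kExecInstr, true},              // synthetic section, ignored
  };
  assert(pltMayBePurecode(sections, true));
  assert(!pltMayBePurecode(sections, false)); // the flag is AArch64-only
  sections.push_back({kExecInstr, false});    // ordinary RX code
  assert(!pltMayBePurecode(sections, true));
}
```

Unlike the `all_of` version, this form cannot stop early, because the loop has to keep running for its other per-section work.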

Apologies for the long comment!

It is possible to find the OutputSection that contains the .plt; it would be something like .in.plt->parent. That would mean that we would only need to check OutputSections rather than input sections. If the .plt is mixed with non-XO InputSections, then the OutputSection it is in will be non-XO. However ...

Taking a step back, I think it is worth thinking through what the heuristics for program header generation are when it comes to XO. Apologies, I didn't have time to write this up yesterday evening. I think there could be more than just the .plt that is affected.

In principle any orphan section with SHF_PURECODE (that generates an OutputSection) will propagate SHF_PURECODE to the OutputSection, which is going to auto-generate an XO program header on a transition from non-XO, which isn't going to be helpful for a non-XO program. How much of a problem this is I don't know. For an Android/Linux system needing full XO, there may be a non-zero number of libraries that need SHF_PURECODE just in case they are used in an XO context. In a contrived worst case we have alternate XO, non-XO output sections and get a separate program header for each OutputSection.
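
The contrived worst case can be counted with a small sketch. It assumes a simplified rule in which a new load segment starts whenever the permission flags differ from those of the previous output section; `countLoadSegments` is a hypothetical helper, not lld code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kPfX = 1, kPfR = 4; // PF_X and PF_R

// Simplified rule (an assumption): a new load segment starts whenever the
// permission flags differ from those of the previous output section.
size_t countLoadSegments(const std::vector<uint32_t> &sectionFlags) {
  size_t segments = 0;
  for (size_t i = 0; i < sectionFlags.size(); ++i)
    if (i == 0 || sectionFlags[i] != sectionFlags[i - 1])
      ++segments;
  return segments;
}

void checkWorstCase() {
  const uint32_t xo = kPfX, rx = kPfR | kPfX;
  assert(countLoadSegments({xo, xo, xo, xo}) == 1);
  assert(countLoadSegments({xo, rx, xo, rx}) == 4); // one segment per section
}
```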

Thinking of a model for how this would be used, I think we have two (possibly three) cases:

1. Bare-metal system (how XO is currently used on Arm): no PLT, no dynamic linking, a linker script, and a potential mix of XO (my code) and non-XO (library code).

2. An OS that supports a program being either all XO or all non-XO: PLT highly likely, default linker script, dynamic linking. I'm guessing this is where Android will be heading.

3. An OS that supports separate parts of the program being XO and non-XO (presumably separated by a page boundary). I don't think that anyone needs or wants this level of flexibility.

For the bare-metal system we would like to have separate XO and non-XO program headers for the same output file. It is up to the user to write the linker script to separate out the XO and non-XO into distinct memory regions, and possibly use PHDRS to make sure they get what they need.

For the OS that can only have a program that's XO or not XO, we ideally want all executable OutputSections to be XO before generating an XO program header.

For the OS that can have multiple XO and non XO parts, then there's no good simple heuristic that I can think of that's always going to work. However I think we can probably rule this use case out.

With that in mind I propose that we do something like:

1. Unconditionally add SHF_PURECODE to the .plt.

2. For a program using an OS (defined as having a dynamic section or a PLT), when auto-generating program headers (no linker script PHDRS), clear SHF_PURECODE from all executable OutputSections if at least one executable OutputSection is not XO.

3. Leave behaviour as it is for bare-metal programs (those that don't have a PLT or dynamic section).

Not sure I've got that completely right, but it should be close. I think that could be applied in createPhdrs().
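
A minimal sketch of this proposal, written against a hypothetical `OutputSec` type rather than lld's real classes, might look as follows. The function name, its parameters, and the flag values are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint64_t kExecInstr = 0x4;       // SHF_EXECINSTR
constexpr uint64_t kPurecode = 0x20000000; // SHF_AARCH64_PURECODE

struct OutputSec { uint64_t flags; }; // hypothetical stand-in for OutputSection

// Applies the proposed rule before program headers are generated.
void normalizePurecode(std::vector<OutputSec> &sections, bool hasDynamicOrPlt,
                       bool hasPhdrsCommand) {
  if (!hasDynamicOrPlt || hasPhdrsCommand)
    return; // bare-metal or user-controlled layout: leave the flags alone
  bool anyNonXo = false;
  for (const OutputSec &sec : sections)
    if ((sec.flags & kExecInstr) && !(sec.flags & kPurecode))
      anyNonXo = true;
  if (!anyNonXo)
    return;
  for (OutputSec &sec : sections)
    if (sec.flags & kExecInstr)
      sec.flags &= ~kPurecode;
}

void checkNormalizePurecode() {
  // An XO .plt next to RX .text in a dynamically linked program.
  std::vector<OutputSec> os = {{kExecInstr | kPurecode}, {kExecInstr}};
  normalizePurecode(os, /*hasDynamicOrPlt=*/true, /*hasPhdrsCommand=*/false);
  assert(!(os[0].flags & kPurecode));
  // Bare-metal: the XO section keeps its flag.
  std::vector<OutputSec> bare = {{kExecInstr | kPurecode}, {kExecInstr}};
  normalizePurecode(bare, /*hasDynamicOrPlt=*/false, /*hasPhdrsCommand=*/false);
  assert(bare[0].flags & kPurecode);
}
```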

The alternative view is that this is too complicated: the linker should only care about the PLT, and getting XO right is the user's responsibility.

In that case it may simplify to:

1. Unconditionally add SHF_PURECODE to the .plt.

2. If at least one executable OutputSection is non-XO, find the OutputSection containing the PLT (.in.plt->parent) and clear SHF_PURECODE from that OutputSection.

Again this could be done at the start of createPhdrs().

Thanks for the thorough reply! It really helped refine my understanding of the problem.

You're right that the main use case we care about is the whole program being XO or RX. What do you think about doing the following:

1. Unconditionally set SHF_AARCH64_PURECODE for .plt.

2. When auto-generating program headers, consider XO and RX sections compatible, allowing them to be placed in the same segment. We could also add a flag similar to --rosegment to control this behaviour.

At this point we don't need to strip the PURECODE flag from the output sections, they'll just be placed in a program header that is RX instead of XO. Leaving the section flag intact shouldn't cause any issues I think.

We can do this by just adding the following snippet in createPhdrs():

if (newFlags & PF_X) incompatible &= ~PF_R;

For bare-metal targets, this wouldn't allow separate auto-generated program headers with XO and RX code though, a linker script (or just a flag?) would be required to separate those out into different program headers. I don't have any experience working with bare-metal, do you think this is a reasonable requirement? If not, can we detect in the linker whether we're linking for a target with an OS or not?

If you think this would be a good approach, I'll open a separate PR superseding this one, as it's a more general solution.
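
The effect of that one-line relaxation can be sketched in isolation. The model below assumes that a merged segment takes the union of its sections' permission flags; `buildSegments` and its `mergeXoRx` parameter are hypothetical names, and the real createPhdrs() applies further conditions that are not modelled here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kPfX = 1, kPfR = 4; // PF_X and PF_R

// Returns the permission flags of each load segment produced for a run of
// output sections. Merged segments take the union of their sections' flags,
// which is an assumption about how the real linker combines them.
std::vector<uint32_t> buildSegments(const std::vector<uint32_t> &sectionFlags,
                                    bool mergeXoRx) {
  std::vector<uint32_t> segments;
  for (uint32_t newFlags : sectionFlags) {
    if (!segments.empty()) {
      uint32_t incompatible = segments.back() ^ newFlags;
      if (mergeXoRx && (newFlags & kPfX))
        incompatible &= ~kPfR; // the proposed one-line relaxation
      if (!incompatible) {
        segments.back() |= newFlags;
        continue;
      }
    }
    segments.push_back(newFlags);
  }
  return segments;
}

void checkMerging() {
  const uint32_t xo = kPfX, rx = kPfR | kPfX, ro = kPfR;
  // XO and RX code now share one segment, which ends up RX.
  assert(buildSegments({xo, rx, xo}, true) == std::vector<uint32_t>{rx});
  // Without the relaxation each flag change starts a new segment.
  assert(buildSegments({xo, rx, xo}, false).size() == 3);
  // Read-only data still gets a segment separate from code.
  assert(buildSegments({ro, xo}, true).size() == 2);
}
```

Because the section flags themselves are left untouched, only the segment's permissions change, which matches the point above that the PURECODE flag need not be stripped.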

For the bare-metal case with a linker script I think that would be OK. I expect that in the majority of cases a MEMORY region would be set up for the XO and non-XO memory. These would have distinct addresses, so a separate program header would be created anyway. If not, PHDRS could be used to force the separation.

We'd need to release note the change in behaviour but I think that it is worth it to get the merging case right.

I'll open a separate PR then with my proposed approach. Thank you for your insights!

…t` flag Following from the discussion in llvm#132224, this seems like the best approach to deal with a mix of XO and RX output sections in the same binary. This change will also simplify the implementation of the PURECODE section flag for AArch64. To control this behaviour, the `--[no-]xosegment` flag is added to LLD (similarly to `--[no-]rosegment`), which determines whether to allow merging XO and RX sections in the same segment. The default value is `--no-xosegment`, which is a breaking change compared to the previous behaviour. Release notes are also added, since this will be a breaking change.

Il-Capitano · 2025-03-21T15:43:37Z

I opened #132412 as a general approach of dealing with a mix of XO and RX sections in the same binary. I'll close this PR because of that. I'll do a separate change regarding the section flags for .plt and .iplt.

llvmbot added lld lld:ELF labels Mar 20, 2025

Il-Capitano requested review from smithp35 and MaskRay March 20, 2025 14:52

smithp35 reviewed Mar 20, 2025

MaskRay reviewed Mar 21, 2025

Update test according to review feedback

c861972

Il-Capitano mentioned this pull request Mar 21, 2025

[LLD][ELF] Allow merging XO and RX sections, and add --[no-]xosegment flag #132412

Merged

Il-Capitano closed this Mar 21, 2025

Il-Capitano deleted the execute-only-plt branch March 21, 2025 15:43
