[LLD][COFF] Split native and EC .CRT chunks on ARM64X #127203

Merged: 1 commit, Feb 17, 2025
11 changes: 11 additions & 0 deletions lld/COFF/Writer.cpp
@@ -403,6 +403,12 @@ void OutputSection::addContributingPartialSection(PartialSection *sec) {
contribSections.push_back(sec);
}

void OutputSection::splitECChunks() {
  llvm::stable_sort(chunks, [=](const Chunk *a, const Chunk *b) {
    return (a->getMachine() != ARM64) < (b->getMachine() != ARM64);
  });
}
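
For readers less familiar with the idiom: the comparator sorts on a boolean key, so chunks for which `getMachine() != ARM64` is false (native) order before all others, and the stable sort keeps the existing relative order inside each group. Below is a minimal standalone sketch of that idea. `Machine`, `FakeChunk`, `values`, and `exampleSplit` are hypothetical stand-ins, and `std::stable_sort` is used in place of `llvm::stable_sort`; this is not LLD code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for LLD's machine types and Chunk class.
enum class Machine { ARM64, ARM64EC, AMD64 };

struct FakeChunk {
  Machine machine;
  uint64_t value;
};

// Move native ARM64 chunks to the front. The key is a bool, so "false"
// (native) sorts before "true" (EC / x86_64); stability preserves the
// relative order inside each group.
inline void splitECChunks(std::vector<FakeChunk> &chunks) {
  std::stable_sort(chunks.begin(), chunks.end(),
                   [](const FakeChunk &a, const FakeChunk &b) {
                     return (a.machine != Machine::ARM64) <
                            (b.machine != Machine::ARM64);
                   });
}

// Collect the payload values in their current order.
inline std::vector<uint64_t> values(const std::vector<FakeChunk> &chunks) {
  std::vector<uint64_t> out;
  for (const FakeChunk &c : chunks)
    out.push_back(c.value);
  return out;
}

// Small self-check: interleaved input ends up as native first, EC second.
inline void exampleSplit() {
  std::vector<FakeChunk> chunks = {{Machine::ARM64EC, 0x11},
                                   {Machine::ARM64, 1},
                                   {Machine::AMD64, 0x12},
                                   {Machine::ARM64, 2}};
  splitECChunks(chunks);
  assert((values(chunks) == std::vector<uint64_t>{1, 2, 0x11, 0x12}));
}
```

Using a stable sort rather than a plain sort matters here: the chunks are already in their intended order within the section, and only the native/EC grouping should change.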

// Check whether the target address S is in range from a relocation
// of type relType at address P.
bool Writer::isInRange(uint16_t relType, uint64_t s, uint64_t p, int margin,
@@ -1156,6 +1162,11 @@ void Writer::createSections() {
sec->addContributingPartialSection(pSec);
}

if (ctx.hybridSymtab) {
if (OutputSection *sec = findSection(".CRT"))
Member

Oh, tricky. I guess we'd need to do something similar for every section that contains sorted ranges of pointers like this - presumably .ctors too?

What about sections like .tls? I presume that's coming up in a future patch too :-)

Contributor Author

> I guess we'd need to do something similar for every section that contains sorted ranges of pointers like this - presumably .ctors too?

Yes, that's #127205.

> What about sections like .tls? I presume that's coming up in a future patch too :-)

Surprisingly, no, .tls is a bit unusual. TLS callbacks are part of the .CRT section, so this PR already handles that.

The .tls section would seem like a good candidate for splitting, but based on my MSVC testing, it doesn’t behave that way. It’s somewhat similar to regular data handling, where EC data is mixed with native data. Since TLS callbacks are referenced by the TLS directory entry, I expected it to use ARM64X relocations in some way (for example, by defining _tls_used in both namespaces and swapping them in the directory entry using ARM64X relocs), but that doesn’t happen either.

With ARM64X, MSVC emits only the native directory. I guess that’s sufficient because the CRT later uses .tls and .CRT symbols without relying on _tls_used at runtime. That's already how LLD behaves, though I wouldn't rule out trying to improve it at some point...

sec->splitECChunks();
}

// Finally, move some output sections to the end.
auto sectionOrder = [&](const OutputSection *s) {
// Move DISCARDABLE (or non-memory-mapped) sections to the end of file
3 changes: 3 additions & 0 deletions lld/COFF/Writer.h
@@ -50,6 +50,9 @@ class OutputSection {
void writeHeaderTo(uint8_t *buf, bool isDebug);
void addContributingPartialSection(PartialSection *sec);

// Sort chunks to split native and EC sections on hybrid targets.
void splitECChunks();

// Returns the size of this section in an executable memory image.
// This may be smaller than the raw size (the raw size is multiple
// of disk sector size, so there may be padding at end), or may be
42 changes: 42 additions & 0 deletions lld/test/COFF/arm64x-crt-sec.s
@@ -0,0 +1,42 @@
// REQUIRES: aarch64, x86
// RUN: split-file %s %t.dir && cd %t.dir

// RUN: llvm-mc -filetype=obj -triple=aarch64-windows crt1-arm64.s -o crt1-arm64.obj
// RUN: llvm-mc -filetype=obj -triple=aarch64-windows crt2-arm64.s -o crt2-arm64.obj
// RUN: llvm-mc -filetype=obj -triple=arm64ec-windows crt1-arm64ec.s -o crt1-arm64ec.obj
// RUN: llvm-mc -filetype=obj -triple=x86_64-windows crt2-amd64.s -o crt2-amd64.obj

// Check that .CRT chunks are correctly sorted and that EC and native chunks are split.

// RUN: lld-link -out:out.dll -machine:arm64x -dll -noentry crt1-arm64.obj crt2-arm64.obj crt1-arm64ec.obj crt2-amd64.obj
// RUN: llvm-readobj --hex-dump=.CRT out.dll | FileCheck %s

// RUN: lld-link -out:out2.dll -machine:arm64x -dll -noentry crt1-arm64.obj crt1-arm64ec.obj crt2-arm64.obj crt2-amd64.obj
// RUN: llvm-readobj --hex-dump=.CRT out2.dll | FileCheck %s

// RUN: lld-link -out:out3.dll -machine:arm64x -dll -noentry crt2-amd64.obj crt1-arm64ec.obj crt2-arm64.obj crt1-arm64.obj
// RUN: llvm-readobj --hex-dump=.CRT out3.dll | FileCheck %s

// CHECK: 0x180002000 01000000 00000000 02000000 00000000
// CHECK-NEXT: 0x180002010 03000000 00000000 11000000 00000000
// CHECK-NEXT: 0x180002020 12000000 00000000 13000000 00000000

#--- crt1-arm64.s
.section .CRT$A,"dr"
.xword 1
.section .CRT$Z,"dr"
.xword 3

#--- crt2-arm64.s
.section .CRT$B,"dr"
.xword 2

#--- crt1-arm64ec.s
.section .CRT$A,"dr"
.xword 0x11
.section .CRT$Z,"dr"
.xword 0x13

#--- crt2-amd64.s
.section .CRT$B,"dr"
.quad 0x12
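
The expected hex dump can be reproduced with a small model of the layout. The sketch below assumes (as a simplification, not a description of LLD's actual implementation) that chunks are first ordered by full section name, keeping command-line order within a name, and are then split into native and EC groups with a stable sort. `CrtChunk`, `concatObjects`, `layoutCrt`, and the four object helpers are hypothetical names that mirror the test inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one .CRT input chunk; not LLD's real data structures.
struct CrtChunk {
  std::string section; // e.g. ".CRT$A"
  bool isNative;       // true for ARM64, false for ARM64EC / x86_64
  uint64_t value;      // the 8-byte payload emitted by .xword / .quad
};

// The four object files from the test, in model form.
inline std::vector<CrtChunk> crt1Arm64() {
  return {{".CRT$A", true, 1}, {".CRT$Z", true, 3}};
}
inline std::vector<CrtChunk> crt2Arm64() { return {{".CRT$B", true, 2}}; }
inline std::vector<CrtChunk> crt1Arm64ec() {
  return {{".CRT$A", false, 0x11}, {".CRT$Z", false, 0x13}};
}
inline std::vector<CrtChunk> crt2Amd64() { return {{".CRT$B", false, 0x12}}; }

// Concatenate object files in command-line order.
inline std::vector<CrtChunk>
concatObjects(const std::vector<std::vector<CrtChunk>> &objs) {
  std::vector<CrtChunk> all;
  for (const auto &obj : objs)
    all.insert(all.end(), obj.begin(), obj.end());
  return all;
}

// Assumed two-step layout: stable sort by section name, then stable split
// that moves native chunks ahead of EC ones.
inline std::vector<uint64_t> layoutCrt(std::vector<CrtChunk> chunks) {
  std::stable_sort(chunks.begin(), chunks.end(),
                   [](const CrtChunk &a, const CrtChunk &b) {
                     return a.section < b.section;
                   });
  std::stable_sort(chunks.begin(), chunks.end(),
                   [](const CrtChunk &a, const CrtChunk &b) {
                     return a.isNative && !b.isNative;
                   });
  std::vector<uint64_t> out;
  for (const CrtChunk &c : chunks)
    out.push_back(c.value);
  return out;
}

// Self-check for the first link order used by the test.
inline void exampleLayout() {
  const std::vector<uint64_t> expected = {1, 2, 3, 0x11, 0x12, 0x13};
  assert(layoutCrt(concatObjects({crt1Arm64(), crt2Arm64(), crt1Arm64ec(),
                                  crt2Amd64()})) == expected);
}
```

Under this model, all three link orders in the RUN lines yield the same sequence, which matches the single set of CHECK lines shared by `out.dll`, `out2.dll`, and `out3.dll`.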