Skip to content

[LLD][COFF] Emit tail merge pdata for delay load thunks on ARM64EC #116810

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 20, 2024

Conversation

cjacek
Copy link
Contributor

@cjacek cjacek commented Nov 19, 2024

No description provided.

@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-coff

@llvm/pr-subscribers-platform-windows

Author: Jacek Caban (cjacek)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/116810.diff

2 Files Affected:

  • (modified) lld/COFF/DLL.cpp (+4)
  • (modified) lld/test/COFF/arm64ec-delayimport.test (+8-5)
diff --git a/lld/COFF/DLL.cpp b/lld/COFF/DLL.cpp
index 797d9f1490253a..0f6a40a41ca00f 100644
--- a/lld/COFF/DLL.cpp
+++ b/lld/COFF/DLL.cpp
@@ -395,6 +395,7 @@ class TailMergePDataChunkX64 : public NonSectionChunk {
   }
 
   size_t getSize() const override { return 3 * sizeof(uint32_t); }
+  MachineTypes getMachine() const override { return AMD64; }
 
   void writeTo(uint8_t *buf) const override {
     write32le(buf + 0, tm->getRVA()); // TailMergeChunk start RVA
@@ -415,6 +416,7 @@ class TailMergeUnwindInfoX64 : public NonSectionChunk {
   }
 
   size_t getSize() const override { return sizeof(tailMergeUnwindInfoX64); }
+  MachineTypes getMachine() const override { return AMD64; }
 
   void writeTo(uint8_t *buf) const override {
     memcpy(buf, tailMergeUnwindInfoX64, sizeof(tailMergeUnwindInfoX64));
@@ -882,6 +884,7 @@ Chunk *DelayLoadContents::newTailMergeChunk(Chunk *dir) {
 Chunk *DelayLoadContents::newTailMergeUnwindInfoChunk() {
   switch (ctx.config.machine) {
   case AMD64:
+  case ARM64EC:
     return make<TailMergeUnwindInfoX64>();
     // FIXME: Add support for other architectures.
   default:
@@ -891,6 +894,7 @@ Chunk *DelayLoadContents::newTailMergeUnwindInfoChunk() {
 Chunk *DelayLoadContents::newTailMergePDataChunk(Chunk *tm, Chunk *unwind) {
   switch (ctx.config.machine) {
   case AMD64:
+  case ARM64EC:
     return make<TailMergePDataChunkX64>(tm, unwind);
     // FIXME: Add support for other architectures.
   default:
diff --git a/lld/test/COFF/arm64ec-delayimport.test b/lld/test/COFF/arm64ec-delayimport.test
index 0c8009362f80e7..1e0bd899ba323d 100644
--- a/lld/test/COFF/arm64ec-delayimport.test
+++ b/lld/test/COFF/arm64ec-delayimport.test
@@ -12,9 +12,9 @@ RUN: lld-link -machine:arm64ec -dll -noentry -out:out.dll loadconfig-arm64ec.obj
 RUN:          helper-mangled.obj test-arm64ec.lib test2-arm64ec.lib -delayload:test.dll -map
 
 RUN: llvm-readobj --hex-dump=.test out.dll | FileCheck --check-prefix=TESTSEC %s
-TESTSEC:      0x180009000 00600000 88700000 00200000 10100000
-TESTSEC-NEXT: 0x180009010 08600000 90700000 10200000 30100000
-TESTSEC-NEXT: 0x180009020 1c100000 3c100000 00300000
+TESTSEC:      0x18000a000 00600000 88700000 00200000 10100000
+TESTSEC-NEXT: 0x18000a010 08600000 90700000 10200000 30100000
+TESTSEC-NEXT: 0x18000a020 1c100000 3c100000 00300000
 
 RUN: llvm-objdump -d out.dll | FileCheck --check-prefix=DISASM %s
 DISASM:      0000000180001000 <.text>:
@@ -61,7 +61,7 @@ DISASM-NEXT: 18000203d: 66 0f 7f 4c 24 10            movdqa  %xmm1, 0x10(%rsp)
 DISASM-NEXT: 180002043: 66 0f 7f 54 24 20            movdqa  %xmm2, 0x20(%rsp)
 DISASM-NEXT: 180002049: 66 0f 7f 5c 24 30            movdqa  %xmm3, 0x30(%rsp)
 DISASM-NEXT: 18000204f: 48 8b d0                     movq    %rax, %rdx
-DISASM-NEXT: 180002052: 48 8d 0d 97 21 00 00         leaq    0x2197(%rip), %rcx      # 0x1800041f0
+DISASM-NEXT: 180002052: 48 8d 0d a7 21 00 00         leaq    0x21a7(%rip), %rcx      # 0x180004200
 DISASM-NEXT: 180002059: e8 aa ef ff ff               callq   0x180001008 <.text+0x8>
 DISASM-NEXT: 18000205e: 66 0f 6f 04 24               movdqa  (%rsp), %xmm0
 DISASM-NEXT: 180002063: 66 0f 6f 4c 24 10            movdqa  0x10(%rsp), %xmm1
@@ -85,7 +85,7 @@ IMPORTS-NEXT:   Name: test.dll
 IMPORTS-NEXT:   Attributes: 0x1
 IMPORTS-NEXT:   ModuleHandle: 0x7080
 IMPORTS-NEXT:   ImportAddressTable: 0x7088
-IMPORTS-NEXT:   ImportNameTable: 0x4230
+IMPORTS-NEXT:   ImportNameTable: 0x4240
 IMPORTS-NEXT:   BoundDelayImportTable: 0x0
 IMPORTS-NEXT:   UnloadDelayImportTable: 0x0
 IMPORTS-NEXT:   Import {
@@ -140,6 +140,9 @@ RELOC-NEXT:     Type: DIR64
 RELOC-NEXT:     Address: 0x6008
 RELOC-NEXT:   }
 
+RUN: llvm-readobj --hex-dump=.pdata out.dll | FileCheck --check-prefix=PDATA %s
+PDATA: 0x180008000 2e200000 81200000 18400000
+
 Verify that a demangled version of __delayLoadHelper2 can be used.
 
 RUN: lld-link -machine:arm64ec -dll -noentry -out:out2.dll loadconfig-arm64ec.obj test.obj \

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, looks reasonable and straightforward.

@@ -395,6 +395,7 @@ class TailMergePDataChunkX64 : public NonSectionChunk {
}

size_t getSize() const override { return 3 * sizeof(uint32_t); }
MachineTypes getMachine() const override { return AMD64; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this bit needed here (and the other addition of getMachine() as well), or more precisely - why wasn't it needed before and why is it needed now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ARM64EC targets require an additional operation in Writer::mergeSections to separate ARM and x86 chunks. Other targets do not consider the machine type of .pdata entries, so this was unnecessary prior to this PR.

I added getMachine to TailMergeUnwindInfoX64 for consistency, though it could be skipped.

@cjacek cjacek merged commit c97478c into llvm:main Nov 20, 2024
12 checks passed
@cjacek cjacek deleted the delayload-pdata branch November 20, 2024 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants