-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[LLD][COFF] Generate redirection metadata for custom ARM64EC export thunks. #105901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-coff Author: Jacek Caban (cjacek) ChangesThis allows using custom export thunks instead of default generated ones. This is useful for performance in cases where transfering between JIT and ARM64EC code is more expensive than just emulating the whole function (but it's still useful to have ARM64EC version so that ARM64EC callers don't call into the emulator). It's also useful for compatibility, where applications have specific expectations about function contents (like syscall functions). Full diff: https://github.com/llvm/llvm-project/pull/105901.diff 2 Files Affected:
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 0360e186ecf0cf..5fc05d7556f4f8 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -2062,6 +2062,20 @@ void Writer::createECChunks() {
if (auto thunk = dyn_cast<ECExportThunkChunk>(sym->getChunk())) {
hexpthkSec->addChunk(thunk);
exportThunks.push_back({thunk, thunk->target});
+ } else if (auto def = dyn_cast<DefinedRegular>(sym)) {
+ // Allow section chunk to be treated as an export thunk if it looks like
+ // one.
+ SectionChunk *chunk = def->getChunk();
+ if (!chunk->live || chunk->getMachine() != AMD64)
+ continue;
+ assert(sym->getName().starts_with("EXP+"));
+ StringRef targetName = sym->getName().substr(strlen("EXP+"));
+ Symbol *targetSym = ctx.symtab.find((targetName + "$hp_target").str());
+ if (!targetSym)
+ targetSym = ctx.symtab.find(targetName);
+ Defined *t = dyn_cast_or_null<Defined>(targetSym);
+ if (t && isArm64EC(t->getChunk()->getMachine()))
+ exportThunks.push_back({chunk, t});
}
}
diff --git a/lld/test/COFF/arm64ec-cust-export-thunk.s b/lld/test/COFF/arm64ec-cust-export-thunk.s
new file mode 100644
index 00000000000000..e56f34c8752173
--- /dev/null
+++ b/lld/test/COFF/arm64ec-cust-export-thunk.s
@@ -0,0 +1,82 @@
+# REQUIRES: aarch64, x86
+# RUN: split-file %s %t.dir && cd %t.dir
+
+# Test that metadata is generated when a custom export thunk is supplied.
+
+# RUN: llvm-mc -filetype=obj -triple=arm64ec-windows func.s -o func.obj
+# RUN: llvm-mc -filetype=obj -triple=arm64ec-windows hp-func.s -o hp-func.obj
+# RUN: llvm-mc -filetype=obj -triple=x86_64-windows thunk.s -o thunk.obj
+# RUN: llvm-mc -filetype=obj -triple=arm64ec-windows %S/Inputs/loadconfig-arm64ec.s -o loadconfig-arm64ec.obj
+
+# RUN: lld-link -out:out.dll -machine:arm64ec func.obj thunk.obj loadconfig-arm64ec.obj -dll -noentry "-export:#func,EXPORTAS,func"
+
+# RUN: llvm-objdump -d out.dll | FileCheck --check-prefixes=DISASM,DISASM-EXP %s
+# DISASM: Disassembly of section .text:
+# DISASM-EMPTY:
+# DISASM-NEXT: 0000000180001000 <.text>:
+# DISASM-NEXT: 180001000: 52800040 mov w0, #0x2 // =2
+# DISASM-NEXT: 180001004: d65f03c0 ret
+# DISASM-NEXT: ...
+# DISASM-EXP-EMPTY:
+# DISASM-EXP-NEXT: 0000000180002000 <func>:
+# DISASM-NEXT: 180002000: b8 03 00 00 00 movl $0x3, %eax
+# DISASM-NEXT: 180002005: c3 retq
+
+# RUN: llvm-objdump -p out.dll | FileCheck --check-prefix=EXPORT %s
+# EXPORT: Ordinal RVA Name
+# EXPORT-NEXT: 1 0x2000 func
+
+# RUN: llvm-readobj --coff-load-config out.dll | FileCheck --check-prefix=CHPE %s
+# CHPE: CodeMap [
+# CHPE-NEXT: 0x1000 - 0x1008 ARM64EC
+# CHPE-NEXT: 0x2000 - 0x2006 X64
+# CHPE-NEXT: ]
+# CHPE-NEXT: CodeRangesToEntryPoints [
+# CHPE-NEXT: 0x2000 - 0x2006 -> 0x2000
+# CHPE-NEXT: ]
+# CHPE-NEXT: RedirectionMetadata [
+# CHPE-NEXT: 0x2000 -> 0x1000
+# CHPE-NEXT: ]
+
+# RUN: lld-link -out:out2.dll -machine:arm64ec hp-func.obj thunk.obj loadconfig-arm64ec.obj -dll -noentry
+# RUN: llvm-objdump -d out2.dll | FileCheck --check-prefix=DISASM %s
+# RUN: llvm-readobj --coff-load-config out2.dll | FileCheck --check-prefix=CHPE %s
+
+#--- func.s
+ .globl "#func"
+ .p2align 2, 0x0
+"#func":
+ mov w0, #2
+ ret
+
+#--- hp-func.s
+ .section .text,"xr",discard,"#func$hp_target"
+ .globl "#func$hp_target"
+ .p2align 2, 0x0
+"#func$hp_target":
+ mov w0, #2
+ ret
+
+ .def "EXP+#func"
+ .scl 2
+ .type 32
+ .endef
+ .weak func
+.set func, "EXP+#func"
+ .weak "#func"
+.set "#func", "#func$hp_target"
+
+ .data
+ .rva func
+
+#--- thunk.s
+ .def "EXP+#func"
+ .scl 2
+ .type 32
+ .endef
+ .section .wowthk$aa,"xr",discard,"EXP+#func"
+ .globl "EXP+#func"
+ .p2align 2, 0x0
+"EXP+#func":
+ movl $3, %eax
+ retq
|
A more detailed explanation of a scenario where this impacts performance can be found in this blog article (search for "EXP+#"). During my experimentation, I had limited success trying to get the MSVC linker to generate executables with the required thunks. While it is possible to pass these thunks similarly to how they are handled in tests, this approach seems less flexible than the implementation I’ve provided here. For instance, exporting the thunks leads to a duplicated symbol error, which I worked around by using the DATA export type to prevent the linker from generating another thunk. However, I wasn’t able to find a way to control the redirection target directly. It appears that MSVC linker reads the contents at an offset expected in EXExportThunkCode and blindly uses that as if it were a generated thunk. This patch, on the other hand, uses symbol lookup, ensuring consistent behavior akin to how generated code would function. It identifies and treats any thunk chunk resembling an export thunk accordingly. It depends on #105897 for a corner case when |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, thanks!
StringRef targetName = sym->getName().substr(strlen("EXP+")); | ||
Symbol *targetSym = ctx.symtab.find((targetName + "$hp_target").str()); | ||
if (!targetSym) | ||
targetSym = ctx.symtab.find(targetName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment here, that for an export thunk EXP+#foo
, we're first looking for #foo$hp_target
, but secondarily also for #foo
. (I was wondering about this and scratching my head, overlooking the fact that we do look for the unsuffixed version two lines below...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I added a comment and merged. Thanks for reviews!
…hunks. This allows using custom export thunks instead of default generated ones. This is useful for performance in cases where transfering between JIT and ARM64EC code is more expensive than just emulating the whole function (but it's still useful to have ARM64EC version so that ARM64EC callers don't call into the emulator). It's also useful for compatibility, where applications have specific expectations about function contents (like syscall functions).
2c7be0a
to
539517c
Compare
This allows using custom export thunks instead of default generated ones. This is useful for performance in cases where transfering between JIT and ARM64EC code is more expensive than just emulating the whole function (but it's still useful to have ARM64EC version so that ARM64EC callers don't call into the emulator). It's also useful for compatibility, where applications have specific expectations about function contents (like syscall functions).