Skip to content

[LLD][COFF] Initial support for ARM64EC importlibs. #107164

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 4, 2024

Conversation

cjacek
Copy link
Contributor

@cjacek cjacek commented Sep 3, 2024

Use demangled symbol name for _imp symbols and define demangled thunk symbol as AMD64 thunk.

@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2024

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-coff

@llvm/pr-subscribers-platform-windows

Author: Jacek Caban (cjacek)

Changes

Use demangled symbol name for _imp symbols and define demangled thunk symbol as AMD64 thunk.


Full diff: https://github.com/llvm/llvm-project/pull/107164.diff

5 Files Affected:

  • (modified) lld/COFF/InputFiles.cpp (+18-5)
  • (modified) lld/COFF/InputFiles.h (+1-1)
  • (modified) lld/COFF/SymbolTable.cpp (+2-2)
  • (modified) lld/COFF/SymbolTable.h (+1-1)
  • (added) lld/test/COFF/arm64ec-import.test (+68)
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index e1ea4ebeabc9b8..fb86fe2cd35ad0 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -25,6 +25,7 @@
 #include "llvm/DebugInfo/CodeView/TypeDeserializer.h"
 #include "llvm/DebugInfo/PDB/Native/NativeSession.h"
 #include "llvm/DebugInfo/PDB/Native/PDBFile.h"
+#include "llvm/IR/Mangler.h"
 #include "llvm/LTO/LTO.h"
 #include "llvm/Object/Binary.h"
 #include "llvm/Object/COFF.h"
@@ -1019,9 +1020,16 @@ void ImportFile::parse() {
 
   // Read names and create an __imp_ symbol.
   StringRef buf = mb.getBuffer().substr(sizeof(*hdr));
-  StringRef name = saver().save(buf.split('\0').first);
+  StringRef nameBuf = buf.split('\0').first, name;
+  if (isArm64EC(hdr->Machine)) {
+    if (std::optional<std::string> demangledName =
+            getArm64ECDemangledFunctionName(nameBuf))
+      name = saver().save(*demangledName);
+  }
+  if (name.empty())
+    name = saver().save(nameBuf);
   StringRef impName = saver().save("__imp_" + name);
-  buf = buf.substr(name.size() + 1);
+  buf = buf.substr(nameBuf.size() + 1);
   dllName = buf.split('\0').first;
   StringRef extName;
   switch (hdr->getNameType()) {
@@ -1058,9 +1066,14 @@ void ImportFile::parse() {
   // If type is function, we need to create a thunk which jump to an
   // address pointed by the __imp_ symbol. (This allows you to call
   // DLL functions just like regular non-DLL functions.)
-  if (hdr->getType() == llvm::COFF::IMPORT_CODE)
-    thunkSym = ctx.symtab.addImportThunk(
-        name, cast_or_null<DefinedImportData>(impSym), hdr->Machine);
+  if (hdr->getType() == llvm::COFF::IMPORT_CODE) {
+    if (ctx.config.machine != ARM64EC) {
+      thunkSym = ctx.symtab.addImportThunk(name, impSym, hdr->Machine);
+    } else {
+      thunkSym = ctx.symtab.addImportThunk(name, impSym, AMD64);
+      // FIXME: Add aux IAT symbols.
+    }
+  }
 }
 
 BitcodeFile::BitcodeFile(COFFLinkerContext &ctx, MemoryBufferRef mb,
diff --git a/lld/COFF/InputFiles.h b/lld/COFF/InputFiles.h
index a332ac87b265e6..8b3303a8d87f45 100644
--- a/lld/COFF/InputFiles.h
+++ b/lld/COFF/InputFiles.h
@@ -346,7 +346,7 @@ class ImportFile : public InputFile {
   static bool classof(const InputFile *f) { return f->kind() == ImportKind; }
   MachineTypes getMachineType() const override;
 
-  Symbol *impSym = nullptr;
+  DefinedImportData *impSym = nullptr;
   Symbol *thunkSym = nullptr;
   std::string dllName;
 
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index a5f155bc05bc9e..bb7583bb9a7df4 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -771,12 +771,12 @@ Symbol *SymbolTable::addCommon(InputFile *f, StringRef n, uint64_t size,
   return s;
 }
 
-Symbol *SymbolTable::addImportData(StringRef n, ImportFile *f) {
+DefinedImportData *SymbolTable::addImportData(StringRef n, ImportFile *f) {
   auto [s, wasInserted] = insert(n, nullptr);
   s->isUsedInRegularObj = true;
   if (wasInserted || isa<Undefined>(s) || s->isLazy()) {
     replaceSymbol<DefinedImportData>(s, n, f);
-    return s;
+    return cast<DefinedImportData>(s);
   }
 
   reportDuplicate(s, f);
diff --git a/lld/COFF/SymbolTable.h b/lld/COFF/SymbolTable.h
index b5f95d2ad7f112..51c6c79ec14463 100644
--- a/lld/COFF/SymbolTable.h
+++ b/lld/COFF/SymbolTable.h
@@ -102,7 +102,7 @@ class SymbolTable {
   Symbol *addCommon(InputFile *f, StringRef n, uint64_t size,
                     const llvm::object::coff_symbol_generic *s = nullptr,
                     CommonChunk *c = nullptr);
-  Symbol *addImportData(StringRef n, ImportFile *f);
+  DefinedImportData *addImportData(StringRef n, ImportFile *f);
   Symbol *addImportThunk(StringRef name, DefinedImportData *s,
                          uint16_t machine);
   void addLibcall(StringRef name);
diff --git a/lld/test/COFF/arm64ec-import.test b/lld/test/COFF/arm64ec-import.test
new file mode 100644
index 00000000000000..2a80a30910b5c7
--- /dev/null
+++ b/lld/test/COFF/arm64ec-import.test
@@ -0,0 +1,68 @@
+REQUIRES: aarch64, x86
+RUN: split-file %s %t.dir && cd %t.dir
+
+RUN: llvm-mc -filetype=obj -triple=arm64ec-windows test.s -o test.obj
+RUN: llvm-mc -filetype=obj -triple=arm64ec-windows %S/Inputs/loadconfig-arm64ec.s -o loadconfig-arm64ec.obj
+RUN: llvm-lib -machine:arm64ec -def:test.def -out:test-arm64ec.lib
+RUN: llvm-lib -machine:arm64ec -def:test2.def -out:test2-arm64ec.lib
+RUN: llvm-lib -machine:x64 -def:test.def -out:test-x86_64.lib
+
+Link using ARM64EC import library:
+RUN: lld-link -machine:arm64ec -dll -noentry -out:out.dll loadconfig-arm64ec.obj \
+RUN:          test.obj test-arm64ec.lib test2-arm64ec.lib
+
+Link using x86_64 import library:
+RUN: lld-link -machine:arm64ec -dll -noentry -out:out2.dll loadconfig-arm64ec.obj \
+RUN:          test.obj test-x86_64.lib test2-arm64ec.lib
+
+RUN: llvm-readobj --coff-imports out.dll | FileCheck --check-prefix=IMPORTS %s
+RUN: llvm-readobj --coff-imports out2.dll | FileCheck --check-prefix=IMPORTS %s
+IMPORTS:      Import {
+IMPORTS-NEXT:   Name: test.dll
+IMPORTS-NEXT:   ImportLookupTableRVA:
+IMPORTS-NEXT:   ImportAddressTableRVA: 0x2258
+IMPORTS-NEXT:   Symbol: data (0)
+IMPORTS-NEXT:   Symbol: func (0)
+IMPORTS-NEXT:   Symbol: func2 (0)
+IMPORTS-NEXT: }
+IMPORTS-NEXT: Import {
+IMPORTS-NEXT:   Name: test2.dll
+IMPORTS-NEXT:   ImportLookupTableRVA:
+IMPORTS-NEXT:   ImportAddressTableRVA: 0x2278
+IMPORTS-NEXT:   Symbol: t2func (0)
+IMPORTS-NEXT: }
+
+RUN: llvm-objdump -d out.dll | FileCheck --check-prefix=DISASM %s
+RUN: llvm-objdump -d out2.dll | FileCheck --check-prefix=DISASM %s
+
+DISASM:      0000000180001000 <.text>:
+DISASM-NEXT: 180001000: ff 25 5a 12 00 00            jmpq    *0x125a(%rip)           # 0x180002260
+
+RUN: llvm-readobj --hex-dump=.test out.dll | FileCheck --check-prefix=TESTSEC %s
+RUN: llvm-readobj --hex-dump=.test out2.dll | FileCheck --check-prefix=TESTSEC %s
+TESTSEC:      0x180004000 60220000 58220000 68220000 78220000
+TESTSEC-NEXT: 0x180004010 00100000
+
+#--- test.s
+    .section .test, "r"
+    .globl arm64ec_data_sym
+    .p2align 2, 0x0
+arm64ec_data_sym:
+    .rva __imp_func
+    .rva __imp_data
+    .rva __imp_func2
+    .rva __imp_t2func
+    .rva func
+
+#--- test.def
+NAME test.dll
+EXPORTS
+    data DATA
+    func
+    func2
+    unused_func
+
+#--- test2.def
+NAME test2.dll
+EXPORTS
+    t2func

@cjacek
Copy link
Contributor Author

cjacek commented Sep 3, 2024

Depends on #107162.

This PR introduces a basic importlib test, which is a step towards supporting ARM64EC. However, additional ARM64EC-specific features will be required for full compatibility.

In this PR, the import thunk uses the AMD64 machine type. While this is appropriate for demangled symbols, we also need to implement an ARM64EC variant that handles mangled symbols.

Additionally, there's auxiliary IAT handling that requires more symbols (especially __imp_aux_* symbols) and a larger number of chunks overall. For more context, I’ve pushed a draft here, and I’ve rebased my arm64ec branch on top of it.

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM here, but a couple of minor comments.

@@ -1019,9 +1020,16 @@ void ImportFile::parse() {

// Read names and create an __imp_ symbol.
StringRef buf = mb.getBuffer().substr(sizeof(*hdr));
StringRef name = saver().save(buf.split('\0').first);
StringRef nameBuf = buf.split('\0').first, name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistically, I dislike the type var = initialization, otherVar; style - if there's an initialization of one variable (and a nontrivial one in this case) I find it more readable to have the other variable on a separate line, or perhaps reordered so that the pure name without an assignment first.

StringRef nameBuf = buf.split('\0').first, name;
if (isArm64EC(hdr->Machine)) {
if (std::optional<std::string> demangledName =
getArm64ECDemangledFunctionName(nameBuf))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this patch, but I took the time to read the getArm64ECDemangledFunctionName and getArm64ECMangledFunctionName implementations. I think both of them would benefit of some code comments explaining what the actual cases handled are, and what the code really does.

Plus, within getArm64ECMangledFunctionName, the Prefix variable feels misnamed to me, as this doesn't appear at the start of a string, but is injected in the middle of a string.

Use demangled symbol name for __imp_ symbols and define demangled thunk
symbol as AMD64 thunk.
Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now!

@cjacek cjacek merged commit 3d53212 into llvm:main Sep 4, 2024
8 checks passed
@cjacek cjacek deleted the arm64ec-implib branch September 4, 2024 13:03
@cjacek
Copy link
Contributor Author

cjacek commented Sep 4, 2024

I pushed and created a follow-up for mangling helpers clarity, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants