[LLD][COFF] Add support for x86_64 archives on ARM64X #128241

Merged (1 commit) on Feb 22, 2025

cjacek (Contributor) commented Feb 21, 2025

If the ECSYMBOLS section is missing in the archive, the archive could be either a native-only ARM64 or x86_64 archive. Check the machine type of the object containing a symbol to determine which symbol table to use.
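
The selection rule described above can be modeled in a few lines. This is a minimal standalone sketch, not the lld implementation: `Machine`, `SymbolTable`, `LinkerContext`, and `selectArchiveSymtab` are hypothetical stand-ins, and it assumes that on ARM64X the hybrid table is the one shared by ARM64EC and x86_64 symbols, which is what the tests in this PR suggest.

```cpp
#include <string>

// Hypothetical stand-ins for lld's types; the names are illustrative only.
enum class Machine { Unknown, ARM64, ARM64EC, AMD64 };

struct SymbolTable {
  std::string name;
};

struct LinkerContext {
  SymbolTable native{"native"};
  SymbolTable *hybrid = nullptr; // non-null only when linking for ARM64X

  // Assumption: ARM64EC and x86_64 members share the hybrid (EC) table;
  // everything else, including unknown machines, stays in the native table.
  SymbolTable &getSymtab(Machine m) {
    if (hybrid && (m == Machine::ARM64EC || m == Machine::AMD64))
      return *hybrid;
    return native;
  }
};

// For an archive without an ECSYMBOLS section: pick the table from the
// machine type of the member that defines the archive's first symbol.
// An empty symbol index, or a non-ARM64X link, keeps the default table.
SymbolTable &selectArchiveSymtab(LinkerContext &ctx, SymbolTable &defaultTab,
                                 bool hasSymbols, Machine firstMember) {
  if (!ctx.hybrid || !hasSymbols)
    return defaultTab;
  return ctx.getSymtab(firstMember);
}
```

Under these assumptions an x86_64 archive lands in the EC table, so only EC objects can resolve its symbols, while a native-only ARM64 archive stays in the native table.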

llvmbot (Member) commented Feb 21, 2025

@llvm/pr-subscribers-lld

Author: Jacek Caban (cjacek)

Changes

If the ECSYMBOLS section is missing in the archive, the archive could be either a native-only ARM64 or x86_64 archive. Check the machine type of the object containing a symbol to determine which symbol table to use.


Full diff: https://github.com/llvm/llvm-project/pull/128241.diff

2 Files Affected:

  • (modified) lld/COFF/InputFiles.cpp (+42-1)
  • (modified) lld/test/COFF/arm64x-symtab.s (+32)
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 7b105fb4c17a2..bb9f1407b51f7 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -29,6 +29,7 @@
 #include "llvm/LTO/LTO.h"
 #include "llvm/Object/Binary.h"
 #include "llvm/Object/COFF.h"
+#include "llvm/Object/COFFImportFile.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Endian.h"
 #include "llvm/Support/Error.h"
@@ -122,6 +123,8 @@ ArchiveFile::ArchiveFile(COFFLinkerContext &ctx, MemoryBufferRef m)
 
 void ArchiveFile::parse() {
   COFFLinkerContext &ctx = symtab.ctx;
+  SymbolTable *archiveSymtab = &symtab;
+
   // Parse a MemoryBufferRef as an archive file.
   file = CHECK(Archive::create(mb), this);
 
@@ -136,12 +139,50 @@ void ArchiveFile::parse() {
       // Read both EC and native symbols on ARM64X.
       if (!ctx.hybridSymtab)
         return;
+    } else if (ctx.hybridSymtab) {
+      // If the ECSYMBOLS section is missing in the archive, the archive could
+      // be either a native-only ARM64 or x86_64 archive. Check the machine type
+      // of the object containing a symbol to determine which symbol table to
+      // use.
+      Archive::symbol_iterator sym = file->symbol_begin();
+      if (sym != file->symbol_end()) {
+        MachineTypes machine = IMAGE_FILE_MACHINE_UNKNOWN;
+        Archive::Child child =
+            CHECK(sym->getMember(),
+                  file->getFileName() +
+                      ": could not get the buffer for a child of the archive");
+        MemoryBufferRef mb = CHECK(
+            child.getMemoryBufferRef(),
+            file->getFileName() +
+                ": could not get the buffer for a child buffer of the archive");
+        switch (identify_magic(mb.getBuffer())) {
+        case file_magic::coff_object: {
+          std::unique_ptr<COFFObjectFile> obj =
+              CHECK(COFFObjectFile::create(mb),
+                    check(child.getName()) + ":" + ": not a valid COFF file");
+          machine = MachineTypes(obj->getMachine());
+          break;
+        }
+        case file_magic::coff_import_library:
+          machine = MachineTypes(COFFImportFile(mb).getMachine());
+          break;
+        case file_magic::bitcode: {
+          std::unique_ptr<lto::InputFile> obj =
+              check(lto::InputFile::create(mb));
+          machine = BitcodeFile::getMachineType(obj.get());
+          break;
+        }
+        default:
+          break;
+        }
+        archiveSymtab = &ctx.getSymtab(machine);
+      }
     }
   }
 
   // Read the symbol table to construct Lazy objects.
   for (const Archive::Symbol &sym : file->symbols())
-    ctx.symtab.addLazyArchive(this, sym);
+    archiveSymtab->addLazyArchive(this, sym);
 }
 
 // Returns a buffer pointing to a member file containing a given symbol.
diff --git a/lld/test/COFF/arm64x-symtab.s b/lld/test/COFF/arm64x-symtab.s
index 2b269dde61f61..5f06db150a06d 100644
--- a/lld/test/COFF/arm64x-symtab.s
+++ b/lld/test/COFF/arm64x-symtab.s
@@ -7,7 +7,11 @@
 // RUN: llvm-mc -filetype=obj -triple=aarch64-windows symref.s -o symref-aarch64.obj
 // RUN: llvm-mc -filetype=obj -triple=arm64ec-windows symref.s -o symref-arm64ec.obj
 // RUN: llvm-mc -filetype=obj -triple=x86_64-windows symref.s -o symref-x86_64.obj
+// RUN: llvm-as sym.ll -o sym.ll.obj
 // RUN: llvm-lib -machine:arm64x -out:sym.lib sym-aarch64.obj sym-arm64ec.obj
+// RUN: llvm-lib -machine:amd64 -out:sym-x86_64.lib sym-x86_64.obj
+// RUN: llvm-lib -machine:amd64 -out:sym-ll.lib sym.ll.obj
+// RUN: llvm-lib -machine:amd64 -out:sym-imp.lib -def:sym.def
 
 // Check that native object files can't reference EC symbols.
 
@@ -40,12 +44,40 @@
 
 // RUN: lld-link -machine:arm64x -dll -noentry -out:out2.dll symref-aarch64.obj symref-arm64ec.obj sym.lib
 
+// Check that EC object files can reference x86_64 library symbols.
+
+// RUN: lld-link -machine:arm64x -dll -noentry -out:out3.dll symref-arm64ec.obj sym-x86_64.lib
+// RUN: lld-link -machine:arm64x -dll -noentry -out:out4.dll symref-arm64ec.obj sym-ll.lib
+// RUN: lld-link -machine:arm64x -dll -noentry -out:out5.dll symref-arm64ec.obj sym-imp.lib
+
+// Check that native object files can't reference x86_64 library symbols.
+
+// RUN: not lld-link -machine:arm64x -dll -noentry -out:err3.dll symref-aarch64.obj sym-arm64ec.obj \
+// RUN:              2>&1 | FileCheck --check-prefix=UNDEF %s
+
 #--- symref.s
     .data
     .rva sym
 
+    .text
+    .globl __icall_helper_arm64ec
+__icall_helper_arm64ec:
+    ret
+
 #--- sym.s
      .data
      .globl sym
 sym:
      .word 0
+
+#--- sym.ll
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-windows-msvc19.33.0"
+
+@sym = dso_local global i32 0, align 4
+
+#--- sym.def
+LIBRARY test.dll
+EXPORTS
+        Func
+        sym

llvmbot (Member) commented Feb 21, 2025

@llvm/pr-subscribers-platform-windows

llvmbot (Member) commented Feb 21, 2025

@llvm/pr-subscribers-lld-coff

mstorsjo (Member) left a comment

LGTM

cjacek merged commit b09dfbd into llvm:main on Feb 22, 2025
6 of 11 checks passed
cjacek deleted the arm64x-x86-lib branch on February 22, 2025 10:21

cjacek (Contributor, Author) commented Feb 22, 2025

Thanks! I added one more test (ensuring that we don't break native ARM64 archives) and merged.

SquallATF pushed a commit to SquallATF/llvm-project that referenced this pull request on Mar 10, Mar 20, Apr 2, Apr 17, Apr 30, May 15, May 29, and Jun 13, 2025.