Skip to content

[LLD][COFF] Swap the meaning of symtab and hybridSymtab in hybrid images #135093

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 11, 2025

Conversation

cjacek
Copy link
Contributor

@cjacek cjacek commented Apr 9, 2025

Originally, the intent behind symtab was to represent the symbol table seen in the PE header (without applying ARM64X relocations). However, in most cases outside of writeHeader(), the code references either both symbol tables or only the EC one, for example, mainSymtab in linkerMain() maps to hybridSymtab on ARM64X.

MSVC's link.exe allows pure ARM64EC images to include native ARM64 files. This patch prepares LLD to support the same, which will require hybridSymtab to be available even for ARM64EC. At that point, writeHeader() will need to use the EC symbol table, and the original reasoning for keeping it in hybridSymtab no longer applies.

Given this, it seems cleaner to treat the EC symbol table as the “main” one, assigning it to symtab, and use hybridSymtab for the native symbol table instead. Since writeHeader() will need to be conditional anyway, this change simplifies the rest of the code by allowing other parts to consistently treat ctx.symtab as the main symbol table.

As a further simplification, this also allows us to eliminate symtabEC and use symtab directly; I’ll submit that as a separate PR.

The map file now uses the EC symbol table for printed entry points and exports, matching MSVC behavior.

@cjacek cjacek requested a review from mstorsjo April 9, 2025 22:33
cjacek added a commit to cjacek/llvm-project that referenced this pull request Apr 9, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 9, 2025

@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-lld-coff

Author: Jacek Caban (cjacek)

Changes

Originally, the intent behind symtab was to represent the symbol table seen in the PE header (without applying ARM64X relocations). However, in most cases outside of writeHeader(), the code references either both symbol tables or only the EC one, for example, mainSymtab in linkerMain() maps to hybridSymtab on ARM64X.

MSVC's link.exe allows pure ARM64EC images to include native ARM64 files. This patch prepares LLD to support the same, which will require hybridSymtab to be available even for ARM64EC. At that point, writeHeader() will need to use the EC symbol table, and the original reasoning for keeping it in hybridSymtab no longer applies.

Given this, it seems cleaner to treat the EC symbol table as the “main” one, assigning it to symtab, and use hybridSymtab for the native symbol table instead. Since writeHeader() will need to be conditional anyway, this change simplifies the rest of the code by allowing other parts to consistently treat ctx.symtab as the main symbol table.

As a further simplification, this also allows us to eliminate symtabEC and use symtab directly; I’ll submit that as a separate PR.

The map file now uses the EC symbol table for printed entry points and exports, matching MSVC behavior.


Full diff: https://github.com/llvm/llvm-project/pull/135093.diff

6 Files Affected:

  • (modified) lld/COFF/COFFLinkerContext.h (+3-3)
  • (modified) lld/COFF/Chunks.cpp (+5-5)
  • (modified) lld/COFF/Driver.cpp (+14-21)
  • (modified) lld/COFF/InputFiles.cpp (+1)
  • (modified) lld/COFF/Writer.cpp (+27-27)
  • (added) lld/test/COFF/arm64x-map.s (+60)
diff --git a/lld/COFF/COFFLinkerContext.h b/lld/COFF/COFFLinkerContext.h
index 8322f829d4055..d01a34899c57d 100644
--- a/lld/COFF/COFFLinkerContext.h
+++ b/lld/COFF/COFFLinkerContext.h
@@ -32,7 +32,7 @@ class COFFLinkerContext : public CommonLinkerContext {
   SymbolTable symtab;
   COFFOptTable optTable;
 
-  // A hybrid ARM64EC symbol table on ARM64X target.
+  // A native ARM64 symbol table on ARM64X target.
   std::optional<SymbolTable> hybridSymtab;
 
   // Pointer to the ARM64EC symbol table: either symtab for an ARM64EC target or
@@ -41,16 +41,16 @@ class COFFLinkerContext : public CommonLinkerContext {
 
   // Returns the appropriate symbol table for the specified machine type.
   SymbolTable &getSymtab(llvm::COFF::MachineTypes machine) {
-    if (hybridSymtab && (machine == ARM64EC || machine == AMD64))
+    if (hybridSymtab && machine == ARM64)
       return *hybridSymtab;
     return symtab;
   }
 
   // Invoke the specified callback for each symbol table.
   void forEachSymtab(std::function<void(SymbolTable &symtab)> f) {
-    f(symtab);
     if (hybridSymtab)
       f(*hybridSymtab);
+    f(symtab);
   }
 
   std::vector<ObjFile *> objFileInstances;
diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index 3494d1ba0ac02..3920158923858 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -570,14 +570,14 @@ void SectionChunk::getBaserels(std::vector<Baserel> *res) {
   // to match the value in the EC load config, which is expected to be
   // a relocatable pointer to the __chpe_metadata symbol.
   COFFLinkerContext &ctx = file->symtab.ctx;
-  if (ctx.hybridSymtab && ctx.symtab.loadConfigSym &&
-      ctx.symtab.loadConfigSym->getChunk() == this &&
-      ctx.hybridSymtab->loadConfigSym &&
-      ctx.symtab.loadConfigSize >=
+  if (ctx.hybridSymtab && ctx.hybridSymtab->loadConfigSym &&
+      ctx.hybridSymtab->loadConfigSym->getChunk() == this &&
+      ctx.symtab.loadConfigSym &&
+      ctx.hybridSymtab->loadConfigSize >=
           offsetof(coff_load_configuration64, CHPEMetadataPointer) +
               sizeof(coff_load_configuration64::CHPEMetadataPointer))
     res->emplace_back(
-        ctx.symtab.loadConfigSym->getRVA() +
+        ctx.hybridSymtab->loadConfigSym->getRVA() +
             offsetof(coff_load_configuration64, CHPEMetadataPointer),
         IMAGE_REL_BASED_DIR64);
 }
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 7aa13bdce488e..d9e70a5be3261 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -662,9 +662,9 @@ void LinkerDriver::setMachine(MachineTypes machine) {
     if (machine == ARM64EC)
       ctx.symtabEC = &ctx.symtab;
   } else {
-    ctx.symtab.machine = ARM64;
-    ctx.hybridSymtab.emplace(ctx, ARM64EC);
-    ctx.symtabEC = &*ctx.hybridSymtab;
+    ctx.symtab.machine = ARM64EC;
+    ctx.hybridSymtab.emplace(ctx, ARM64);
+    ctx.symtabEC = &ctx.symtab;
   }
 
   addWinSysRootLibSearchPaths();
@@ -981,12 +981,9 @@ void LinkerDriver::createImportLibrary(bool asLib) {
     }
   };
 
-  if (ctx.hybridSymtab) {
-    getExports(ctx.symtab, nativeExports);
-    getExports(*ctx.hybridSymtab, exports);
-  } else {
-    getExports(ctx.symtab, exports);
-  }
+  getExports(ctx.symtab, exports);
+  if (ctx.hybridSymtab)
+    getExports(*ctx.hybridSymtab, nativeExports);
 
   std::string libName = getImportName(asLib);
   std::string path = getImplibPath();
@@ -1818,10 +1815,6 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
     }
   }
 
-  // Most of main arguments apply either to both or only to EC symbol table on
-  // ARM64X target.
-  SymbolTable &mainSymtab = ctx.hybridSymtab ? *ctx.hybridSymtab : ctx.symtab;
-
   // Handle /nodefaultlib:<filename>
   {
     llvm::TimeTraceScope timeScope2("Nodefaultlib");
@@ -1903,11 +1896,11 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   // Handle /alternatename
   for (auto *arg : args.filtered(OPT_alternatename))
-    mainSymtab.parseAlternateName(arg->getValue());
+    ctx.symtab.parseAlternateName(arg->getValue());
 
   // Handle /include
   for (auto *arg : args.filtered(OPT_incl))
-    mainSymtab.addGCRoot(arg->getValue());
+    ctx.symtab.addGCRoot(arg->getValue());
 
   // Handle /implib
   if (auto *arg = args.getLastArg(OPT_implib))
@@ -2056,7 +2049,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   // Handle /aligncomm
   for (auto *arg : args.filtered(OPT_aligncomm))
-    mainSymtab.parseAligncomm(arg->getValue());
+    ctx.symtab.parseAligncomm(arg->getValue());
 
   // Handle /manifestdependency.
   for (auto *arg : args.filtered(OPT_manifestdependency))
@@ -2307,19 +2300,19 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
         if (!e.extName.empty() && !isDecorated(e.extName))
           e.extName = saver().save("_" + e.extName);
       }
-      mainSymtab.exports.push_back(e);
+      ctx.symtab.exports.push_back(e);
     }
   }
 
   // Handle /def
   if (auto *arg = args.getLastArg(OPT_deffile)) {
     // parseModuleDefs mutates Config object.
-    mainSymtab.parseModuleDefs(arg->getValue());
+    ctx.symtab.parseModuleDefs(arg->getValue());
     if (ctx.hybridSymtab) {
       // MSVC ignores the /defArm64Native argument on non-ARM64X targets.
       // It is also ignored if the /def option is not specified.
       if (auto *arg = args.getLastArg(OPT_defarm64native))
-        ctx.symtab.parseModuleDefs(arg->getValue());
+        ctx.hybridSymtab->parseModuleDefs(arg->getValue());
     }
   }
 
@@ -2336,7 +2329,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
   // and after the early return when just writing an import library.
   if (config->subsystem == IMAGE_SUBSYSTEM_UNKNOWN) {
     llvm::TimeTraceScope timeScope("Infer subsystem");
-    config->subsystem = mainSymtab.inferSubsystem();
+    config->subsystem = ctx.symtab.inferSubsystem();
     if (config->subsystem == IMAGE_SUBSYSTEM_UNKNOWN)
       Fatal(ctx) << "subsystem must be defined";
   }
@@ -2702,7 +2695,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   // Handle /output-def (MinGW specific).
   if (auto *arg = args.getLastArg(OPT_output_def))
-    writeDefFile(ctx, arg->getValue(), mainSymtab.exports);
+    writeDefFile(ctx, arg->getValue(), ctx.symtab.exports);
 
   // Set extra alignment for .comm symbols
   ctx.forEachSymtab([&](SymbolTable &symtab) {
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 256bec6bb636c..7916e3fd4ea1f 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -139,6 +139,7 @@ void ArchiveFile::parse() {
       // Read both EC and native symbols on ARM64X.
       if (!ctx.hybridSymtab)
         return;
+      archiveSymtab = &*ctx.hybridSymtab;
     } else if (ctx.hybridSymtab) {
       // If the ECSYMBOLS section is missing in the archive, the archive could
       // be either a native-only ARM64 or x86_64 archive. Check the machine type
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 6ed1f884a9636..31c686761d74f 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -1359,14 +1359,14 @@ void Writer::createExportTable() {
 
       for (auto chunk : edataSec->chunks) {
         if (chunk->getMachine() != ARM64) {
-          ctx.hybridSymtab->edataStart = chunk;
-          ctx.hybridSymtab->edataEnd = edataSec->chunks.back();
+          ctx.symtab.edataStart = chunk;
+          ctx.symtab.edataEnd = edataSec->chunks.back();
           break;
         }
 
-        if (!ctx.symtab.edataStart)
-          ctx.symtab.edataStart = chunk;
-        ctx.symtab.edataEnd = chunk;
+        if (!ctx.hybridSymtab->edataStart)
+          ctx.hybridSymtab->edataStart = chunk;
+        ctx.hybridSymtab->edataEnd = chunk;
       }
     }
   }
@@ -1760,7 +1760,8 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
   assert(coffHeaderOffset == buf - buffer->getBufferStart());
   auto *coff = reinterpret_cast<coff_file_header *>(buf);
   buf += sizeof(*coff);
-  coff->Machine = ctx.symtab.isEC() ? AMD64 : ctx.symtab.machine;
+  SymbolTable &symtab = ctx.hybridSymtab ? *ctx.hybridSymtab : ctx.symtab;
+  coff->Machine = symtab.isEC() ? AMD64 : symtab.machine;
   coff->NumberOfSections = ctx.outputSections.size();
   coff->Characteristics = IMAGE_FILE_EXECUTABLE_IMAGE;
   if (config->largeAddressAware)
@@ -1807,7 +1808,7 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
   pe->SizeOfImage = sizeOfImage;
   pe->SizeOfHeaders = sizeOfHeaders;
   if (!config->noEntry) {
-    Defined *entry = cast<Defined>(ctx.symtab.entry);
+    Defined *entry = cast<Defined>(symtab.entry);
     pe->AddressOfEntryPoint = entry->getRVA();
     // Pointer to thumb code must have the LSB set, so adjust it.
     if (config->machine == ARMNT)
@@ -1851,11 +1852,11 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
          dataDirOffset64 == buf - buffer->getBufferStart());
   auto *dir = reinterpret_cast<data_directory *>(buf);
   buf += sizeof(*dir) * numberOfDataDirectory;
-  if (ctx.symtab.edataStart) {
-    dir[EXPORT_TABLE].RelativeVirtualAddress = ctx.symtab.edataStart->getRVA();
-    dir[EXPORT_TABLE].Size = ctx.symtab.edataEnd->getRVA() +
-                             ctx.symtab.edataEnd->getSize() -
-                             ctx.symtab.edataStart->getRVA();
+  if (symtab.edataStart) {
+    dir[EXPORT_TABLE].RelativeVirtualAddress = symtab.edataStart->getRVA();
+    dir[EXPORT_TABLE].Size = symtab.edataEnd->getRVA() +
+                             symtab.edataEnd->getSize() -
+                             symtab.edataStart->getRVA();
   }
   if (importTableStart) {
     dir[IMPORT_TABLE].RelativeVirtualAddress = importTableStart->getRVA();
@@ -1886,7 +1887,7 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
     dir[BASE_RELOCATION_TABLE].RelativeVirtualAddress = relocSec->getRVA();
     dir[BASE_RELOCATION_TABLE].Size = relocSize;
   }
-  if (Symbol *sym = ctx.symtab.findUnderscore("_tls_used")) {
+  if (Symbol *sym = symtab.findUnderscore("_tls_used")) {
     if (Defined *b = dyn_cast<Defined>(sym)) {
       dir[TLS_TABLE].RelativeVirtualAddress = b->getRVA();
       dir[TLS_TABLE].Size = config->is64()
@@ -1898,10 +1899,10 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
     dir[DEBUG_DIRECTORY].RelativeVirtualAddress = debugDirectory->getRVA();
     dir[DEBUG_DIRECTORY].Size = debugDirectory->getSize();
   }
-  if (ctx.symtab.loadConfigSym) {
+  if (symtab.loadConfigSym) {
     dir[LOAD_CONFIG_TABLE].RelativeVirtualAddress =
-        ctx.symtab.loadConfigSym->getRVA();
-    dir[LOAD_CONFIG_TABLE].Size = ctx.symtab.loadConfigSize;
+        symtab.loadConfigSym->getRVA();
+    dir[LOAD_CONFIG_TABLE].Size = symtab.loadConfigSize;
   }
   if (!delayIdata.empty()) {
     dir[DELAY_IMPORT_DESCRIPTOR].RelativeVirtualAddress =
@@ -2442,7 +2443,7 @@ void Writer::setECSymbols() {
     // For the hybrid image, set the alternate entry point to the EC entry
     // point. In the hybrid view, it is swapped to the native entry point
     // using ARM64X relocations.
-    if (auto altEntrySym = cast_or_null<Defined>(ctx.hybridSymtab->entry)) {
+    if (auto altEntrySym = cast_or_null<Defined>(ctx.symtab.entry)) {
       // If the entry is an EC export thunk, use its target instead.
       if (auto thunkChunk =
               dyn_cast<ECExportThunkChunk>(altEntrySym->getChunk()))
@@ -2711,7 +2712,7 @@ void Writer::createDynamicRelocs() {
   if (ctx.symtab.entry != ctx.hybridSymtab->entry ||
       pdata.first != hybridPdata.first) {
     chpeSym = cast_or_null<DefinedRegular>(
-        ctx.hybridSymtab->findUnderscore("__chpe_metadata"));
+        ctx.symtab.findUnderscore("__chpe_metadata"));
     if (!chpeSym)
       Warn(ctx) << "'__chpe_metadata' is missing for ARM64X target";
   }
@@ -2720,14 +2721,14 @@ void Writer::createDynamicRelocs() {
     ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
                            peHeaderOffset +
                                offsetof(pe32plus_header, AddressOfEntryPoint),
-                           cast_or_null<Defined>(ctx.hybridSymtab->entry));
+                           cast_or_null<Defined>(ctx.symtab.entry));
 
     // Swap the alternate entry point in the CHPE metadata.
     if (chpeSym)
       ctx.dynamicRelocs->add(
           IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
           Arm64XRelocVal(chpeSym, offsetof(chpe_metadata, AlternateEntryPoint)),
-          cast_or_null<Defined>(ctx.symtab.entry));
+          cast_or_null<Defined>(ctx.hybridSymtab->entry));
   }
 
   if (ctx.symtab.edataStart != ctx.hybridSymtab->edataStart) {
@@ -2735,7 +2736,7 @@ void Writer::createDynamicRelocs() {
                            dataDirOffset64 +
                                EXPORT_TABLE * sizeof(data_directory) +
                                offsetof(data_directory, RelativeVirtualAddress),
-                           ctx.hybridSymtab->edataStart);
+                           ctx.symtab.edataStart);
     // The Size value is assigned after addresses are finalized.
     ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
                            dataDirOffset64 +
@@ -2773,12 +2774,12 @@ void Writer::createDynamicRelocs() {
                          dataDirOffset64 +
                              LOAD_CONFIG_TABLE * sizeof(data_directory) +
                              offsetof(data_directory, RelativeVirtualAddress),
-                         ctx.hybridSymtab->loadConfigSym);
+                         ctx.symtab.loadConfigSym);
   ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
                          dataDirOffset64 +
                              LOAD_CONFIG_TABLE * sizeof(data_directory) +
                              offsetof(data_directory, Size),
-                         ctx.hybridSymtab->loadConfigSize);
+                         ctx.symtab.loadConfigSize);
 }
 
 PartialSection *Writer::createPartialSection(StringRef name,
@@ -2889,15 +2890,14 @@ void Writer::prepareLoadConfig(SymbolTable &symtab, T *loadConfig) {
     // On ARM64X, only the EC version of the load config contains
     // CHPEMetadataPointer. Copy its value to the native load config.
     if (ctx.hybridSymtab && !symtab.isEC() &&
-        ctx.hybridSymtab->loadConfigSize >=
+        ctx.symtab.loadConfigSize >=
             offsetof(T, CHPEMetadataPointer) + sizeof(T::CHPEMetadataPointer)) {
       OutputSection *sec =
-          ctx.getOutputSection(ctx.hybridSymtab->loadConfigSym->getChunk());
+          ctx.getOutputSection(ctx.symtab.loadConfigSym->getChunk());
       uint8_t *secBuf = buffer->getBufferStart() + sec->getFileOff();
       auto hybridLoadConfig =
           reinterpret_cast<const coff_load_configuration64 *>(
-              secBuf +
-              (ctx.hybridSymtab->loadConfigSym->getRVA() - sec->getRVA()));
+              secBuf + (ctx.symtab.loadConfigSym->getRVA() - sec->getRVA()));
       loadConfig->CHPEMetadataPointer = hybridLoadConfig->CHPEMetadataPointer;
     }
   }
diff --git a/lld/test/COFF/arm64x-map.s b/lld/test/COFF/arm64x-map.s
new file mode 100644
index 0000000000000..956c52a93b041
--- /dev/null
+++ b/lld/test/COFF/arm64x-map.s
@@ -0,0 +1,60 @@
+// REQUIRES: aarch64
+// RUN: split-file %s %t.dir && cd %t.dir
+
+// RUN: llvm-mc -filetype=obj -triple=aarch64-windows arm64-data-sym.s -o arm64-data-sym.obj
+// RUN: llvm-mc -filetype=obj -triple=arm64ec-windows arm64ec-data-sym.s -o arm64ec-data-sym.obj
+// RUN: lld-link -machine:arm64x -dll -out:out.dll -map -mapinfo:exports arm64-data-sym.obj arm64ec-data-sym.obj
+// RUN: FileCheck %s < out.map
+
+// CHECK:      Start         Length     Name                   Class
+// CHECK-NEXT: 0001:00000000 00001004H .text                   CODE
+// CHECK-NEXT: 0004:00000000 00000008H .data                   DATA
+// CHECK-NEXT: 0004:00000008 00000000H .bss                    DATA
+// CHECK-EMPTY:
+// CHECK-NEXT:  Address         Publics by Value              Rva+Base               Lib:Object
+// CHECK-EMPTY:
+// CHECK-NEXT: 0001:00000000       _DllMainCRTStartup         0000000180001000     arm64-data-sym.obj
+// CHECK-NEXT: 0001:00001000       _DllMainCRTStartup         0000000180002000     arm64ec-data-sym.obj
+// CHECK-NEXT: 0004:00000000       arm64_data_sym             0000000180005000     arm64-data-sym.obj
+// CHECK-NEXT: 0004:00000004       arm64ec_data_sym           0000000180005004     arm64ec-data-sym.obj
+// CHECK-EMPTY:
+// CHECK-NEXT: entry point at         0002:00000000
+// CHECK-EMPTY:
+// CHECK-NEXT: Static symbols
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK-NEXT: Exports
+// CHECK-EMPTY:
+// CHECK-NEXT:  ordinal    name
+// CHECK-EMPTY:
+// CHECK-NEXT:        1    arm64ec_data_sym
+
+#--- arm64ec-data-sym.s
+        .text
+        .globl _DllMainCRTStartup
+_DllMainCRTStartup:
+        ret
+
+        .data
+        .globl arm64ec_data_sym
+        .p2align 2, 0x0
+arm64ec_data_sym:
+        .word 0x02020202
+
+        .section .drectve
+        .ascii "-export:arm64ec_data_sym,DATA"
+
+#--- arm64-data-sym.s
+        .text
+        .globl _DllMainCRTStartup
+_DllMainCRTStartup:
+        ret
+
+        .data
+        .globl arm64_data_sym
+        .p2align 2, 0x0
+arm64_data_sym:
+        .word 0x01010101
+
+        .section .drectve
+        .ascii "-export:arm64_data_sym,DATA"

@llvmbot
Copy link
Member

llvmbot commented Apr 9, 2025

@llvm/pr-subscribers-lld

Author: Jacek Caban (cjacek)

Changes

Originally, the intent behind symtab was to represent the symbol table seen in the PE header (without applying ARM64X relocations). However, in most cases outside of writeHeader(), the code references either both symbol tables or only the EC one, for example, mainSymtab in linkerMain() maps to hybridSymtab on ARM64X.

MSVC's link.exe allows pure ARM64EC images to include native ARM64 files. This patch prepares LLD to support the same, which will require hybridSymtab to be available even for ARM64EC. At that point, writeHeader() will need to use the EC symbol table, and the original reasoning for keeping it in hybridSymtab no longer applies.

Given this, it seems cleaner to treat the EC symbol table as the “main” one, assigning it to symtab, and use hybridSymtab for the native symbol table instead. Since writeHeader() will need to be conditional anyway, this change simplifies the rest of the code by allowing other parts to consistently treat ctx.symtab as the main symbol table.

As a further simplification, this also allows us to eliminate symtabEC and use symtab directly; I’ll submit that as a separate PR.

The map file now uses the EC symbol table for printed entry points and exports, matching MSVC behavior.


Full diff: https://github.com/llvm/llvm-project/pull/135093.diff

6 Files Affected:

  • (modified) lld/COFF/COFFLinkerContext.h (+3-3)
  • (modified) lld/COFF/Chunks.cpp (+5-5)
  • (modified) lld/COFF/Driver.cpp (+14-21)
  • (modified) lld/COFF/InputFiles.cpp (+1)
  • (modified) lld/COFF/Writer.cpp (+27-27)
  • (added) lld/test/COFF/arm64x-map.s (+60)
diff --git a/lld/COFF/COFFLinkerContext.h b/lld/COFF/COFFLinkerContext.h
index 8322f829d4055..d01a34899c57d 100644
--- a/lld/COFF/COFFLinkerContext.h
+++ b/lld/COFF/COFFLinkerContext.h
@@ -32,7 +32,7 @@ class COFFLinkerContext : public CommonLinkerContext {
   SymbolTable symtab;
   COFFOptTable optTable;
 
-  // A hybrid ARM64EC symbol table on ARM64X target.
+  // A native ARM64 symbol table on ARM64X target.
   std::optional<SymbolTable> hybridSymtab;
 
   // Pointer to the ARM64EC symbol table: either symtab for an ARM64EC target or
@@ -41,16 +41,16 @@ class COFFLinkerContext : public CommonLinkerContext {
 
   // Returns the appropriate symbol table for the specified machine type.
   SymbolTable &getSymtab(llvm::COFF::MachineTypes machine) {
-    if (hybridSymtab && (machine == ARM64EC || machine == AMD64))
+    if (hybridSymtab && machine == ARM64)
       return *hybridSymtab;
     return symtab;
   }
 
   // Invoke the specified callback for each symbol table.
   void forEachSymtab(std::function<void(SymbolTable &symtab)> f) {
-    f(symtab);
     if (hybridSymtab)
       f(*hybridSymtab);
+    f(symtab);
   }
 
   std::vector<ObjFile *> objFileInstances;
diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index 3494d1ba0ac02..3920158923858 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -570,14 +570,14 @@ void SectionChunk::getBaserels(std::vector<Baserel> *res) {
   // to match the value in the EC load config, which is expected to be
   // a relocatable pointer to the __chpe_metadata symbol.
   COFFLinkerContext &ctx = file->symtab.ctx;
-  if (ctx.hybridSymtab && ctx.symtab.loadConfigSym &&
-      ctx.symtab.loadConfigSym->getChunk() == this &&
-      ctx.hybridSymtab->loadConfigSym &&
-      ctx.symtab.loadConfigSize >=
+  if (ctx.hybridSymtab && ctx.hybridSymtab->loadConfigSym &&
+      ctx.hybridSymtab->loadConfigSym->getChunk() == this &&
+      ctx.symtab.loadConfigSym &&
+      ctx.hybridSymtab->loadConfigSize >=
           offsetof(coff_load_configuration64, CHPEMetadataPointer) +
               sizeof(coff_load_configuration64::CHPEMetadataPointer))
     res->emplace_back(
-        ctx.symtab.loadConfigSym->getRVA() +
+        ctx.hybridSymtab->loadConfigSym->getRVA() +
             offsetof(coff_load_configuration64, CHPEMetadataPointer),
         IMAGE_REL_BASED_DIR64);
 }
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 7aa13bdce488e..d9e70a5be3261 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -662,9 +662,9 @@ void LinkerDriver::setMachine(MachineTypes machine) {
     if (machine == ARM64EC)
       ctx.symtabEC = &ctx.symtab;
   } else {
-    ctx.symtab.machine = ARM64;
-    ctx.hybridSymtab.emplace(ctx, ARM64EC);
-    ctx.symtabEC = &*ctx.hybridSymtab;
+    ctx.symtab.machine = ARM64EC;
+    ctx.hybridSymtab.emplace(ctx, ARM64);
+    ctx.symtabEC = &ctx.symtab;
   }
 
   addWinSysRootLibSearchPaths();
@@ -981,12 +981,9 @@ void LinkerDriver::createImportLibrary(bool asLib) {
     }
   };
 
-  if (ctx.hybridSymtab) {
-    getExports(ctx.symtab, nativeExports);
-    getExports(*ctx.hybridSymtab, exports);
-  } else {
-    getExports(ctx.symtab, exports);
-  }
+  getExports(ctx.symtab, exports);
+  if (ctx.hybridSymtab)
+    getExports(*ctx.hybridSymtab, nativeExports);
 
   std::string libName = getImportName(asLib);
   std::string path = getImplibPath();
@@ -1818,10 +1815,6 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
     }
   }
 
-  // Most of main arguments apply either to both or only to EC symbol table on
-  // ARM64X target.
-  SymbolTable &mainSymtab = ctx.hybridSymtab ? *ctx.hybridSymtab : ctx.symtab;
-
   // Handle /nodefaultlib:<filename>
   {
     llvm::TimeTraceScope timeScope2("Nodefaultlib");
@@ -1903,11 +1896,11 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   // Handle /alternatename
   for (auto *arg : args.filtered(OPT_alternatename))
-    mainSymtab.parseAlternateName(arg->getValue());
+    ctx.symtab.parseAlternateName(arg->getValue());
 
   // Handle /include
   for (auto *arg : args.filtered(OPT_incl))
-    mainSymtab.addGCRoot(arg->getValue());
+    ctx.symtab.addGCRoot(arg->getValue());
 
   // Handle /implib
   if (auto *arg = args.getLastArg(OPT_implib))
@@ -2056,7 +2049,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   // Handle /aligncomm
   for (auto *arg : args.filtered(OPT_aligncomm))
-    mainSymtab.parseAligncomm(arg->getValue());
+    ctx.symtab.parseAligncomm(arg->getValue());
 
   // Handle /manifestdependency.
   for (auto *arg : args.filtered(OPT_manifestdependency))
@@ -2307,19 +2300,19 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
         if (!e.extName.empty() && !isDecorated(e.extName))
           e.extName = saver().save("_" + e.extName);
       }
-      mainSymtab.exports.push_back(e);
+      ctx.symtab.exports.push_back(e);
     }
   }
 
   // Handle /def
   if (auto *arg = args.getLastArg(OPT_deffile)) {
     // parseModuleDefs mutates Config object.
-    mainSymtab.parseModuleDefs(arg->getValue());
+    ctx.symtab.parseModuleDefs(arg->getValue());
     if (ctx.hybridSymtab) {
       // MSVC ignores the /defArm64Native argument on non-ARM64X targets.
       // It is also ignored if the /def option is not specified.
       if (auto *arg = args.getLastArg(OPT_defarm64native))
-        ctx.symtab.parseModuleDefs(arg->getValue());
+        ctx.hybridSymtab->parseModuleDefs(arg->getValue());
     }
   }
 
@@ -2336,7 +2329,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
   // and after the early return when just writing an import library.
   if (config->subsystem == IMAGE_SUBSYSTEM_UNKNOWN) {
     llvm::TimeTraceScope timeScope("Infer subsystem");
-    config->subsystem = mainSymtab.inferSubsystem();
+    config->subsystem = ctx.symtab.inferSubsystem();
     if (config->subsystem == IMAGE_SUBSYSTEM_UNKNOWN)
       Fatal(ctx) << "subsystem must be defined";
   }
@@ -2702,7 +2695,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   // Handle /output-def (MinGW specific).
   if (auto *arg = args.getLastArg(OPT_output_def))
-    writeDefFile(ctx, arg->getValue(), mainSymtab.exports);
+    writeDefFile(ctx, arg->getValue(), ctx.symtab.exports);
 
   // Set extra alignment for .comm symbols
   ctx.forEachSymtab([&](SymbolTable &symtab) {
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 256bec6bb636c..7916e3fd4ea1f 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -139,6 +139,7 @@ void ArchiveFile::parse() {
       // Read both EC and native symbols on ARM64X.
       if (!ctx.hybridSymtab)
         return;
+      archiveSymtab = &*ctx.hybridSymtab;
     } else if (ctx.hybridSymtab) {
       // If the ECSYMBOLS section is missing in the archive, the archive could
       // be either a native-only ARM64 or x86_64 archive. Check the machine type
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 6ed1f884a9636..31c686761d74f 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -1359,14 +1359,14 @@ void Writer::createExportTable() {
 
       for (auto chunk : edataSec->chunks) {
         if (chunk->getMachine() != ARM64) {
-          ctx.hybridSymtab->edataStart = chunk;
-          ctx.hybridSymtab->edataEnd = edataSec->chunks.back();
+          ctx.symtab.edataStart = chunk;
+          ctx.symtab.edataEnd = edataSec->chunks.back();
           break;
         }
 
-        if (!ctx.symtab.edataStart)
-          ctx.symtab.edataStart = chunk;
-        ctx.symtab.edataEnd = chunk;
+        if (!ctx.hybridSymtab->edataStart)
+          ctx.hybridSymtab->edataStart = chunk;
+        ctx.hybridSymtab->edataEnd = chunk;
       }
     }
   }
@@ -1760,7 +1760,8 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
   assert(coffHeaderOffset == buf - buffer->getBufferStart());
   auto *coff = reinterpret_cast<coff_file_header *>(buf);
   buf += sizeof(*coff);
-  coff->Machine = ctx.symtab.isEC() ? AMD64 : ctx.symtab.machine;
+  SymbolTable &symtab = ctx.hybridSymtab ? *ctx.hybridSymtab : ctx.symtab;
+  coff->Machine = symtab.isEC() ? AMD64 : symtab.machine;
   coff->NumberOfSections = ctx.outputSections.size();
   coff->Characteristics = IMAGE_FILE_EXECUTABLE_IMAGE;
   if (config->largeAddressAware)
@@ -1807,7 +1808,7 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
   pe->SizeOfImage = sizeOfImage;
   pe->SizeOfHeaders = sizeOfHeaders;
   if (!config->noEntry) {
-    Defined *entry = cast<Defined>(ctx.symtab.entry);
+    Defined *entry = cast<Defined>(symtab.entry);
     pe->AddressOfEntryPoint = entry->getRVA();
     // Pointer to thumb code must have the LSB set, so adjust it.
     if (config->machine == ARMNT)
@@ -1851,11 +1852,11 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
          dataDirOffset64 == buf - buffer->getBufferStart());
   auto *dir = reinterpret_cast<data_directory *>(buf);
   buf += sizeof(*dir) * numberOfDataDirectory;
-  if (ctx.symtab.edataStart) {
-    dir[EXPORT_TABLE].RelativeVirtualAddress = ctx.symtab.edataStart->getRVA();
-    dir[EXPORT_TABLE].Size = ctx.symtab.edataEnd->getRVA() +
-                             ctx.symtab.edataEnd->getSize() -
-                             ctx.symtab.edataStart->getRVA();
+  if (symtab.edataStart) {
+    dir[EXPORT_TABLE].RelativeVirtualAddress = symtab.edataStart->getRVA();
+    dir[EXPORT_TABLE].Size = symtab.edataEnd->getRVA() +
+                             symtab.edataEnd->getSize() -
+                             symtab.edataStart->getRVA();
   }
   if (importTableStart) {
     dir[IMPORT_TABLE].RelativeVirtualAddress = importTableStart->getRVA();
@@ -1886,7 +1887,7 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
     dir[BASE_RELOCATION_TABLE].RelativeVirtualAddress = relocSec->getRVA();
     dir[BASE_RELOCATION_TABLE].Size = relocSize;
   }
-  if (Symbol *sym = ctx.symtab.findUnderscore("_tls_used")) {
+  if (Symbol *sym = symtab.findUnderscore("_tls_used")) {
     if (Defined *b = dyn_cast<Defined>(sym)) {
       dir[TLS_TABLE].RelativeVirtualAddress = b->getRVA();
       dir[TLS_TABLE].Size = config->is64()
@@ -1898,10 +1899,10 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
     dir[DEBUG_DIRECTORY].RelativeVirtualAddress = debugDirectory->getRVA();
     dir[DEBUG_DIRECTORY].Size = debugDirectory->getSize();
   }
-  if (ctx.symtab.loadConfigSym) {
+  if (symtab.loadConfigSym) {
     dir[LOAD_CONFIG_TABLE].RelativeVirtualAddress =
-        ctx.symtab.loadConfigSym->getRVA();
-    dir[LOAD_CONFIG_TABLE].Size = ctx.symtab.loadConfigSize;
+        symtab.loadConfigSym->getRVA();
+    dir[LOAD_CONFIG_TABLE].Size = symtab.loadConfigSize;
   }
   if (!delayIdata.empty()) {
     dir[DELAY_IMPORT_DESCRIPTOR].RelativeVirtualAddress =
@@ -2442,7 +2443,7 @@ void Writer::setECSymbols() {
     // For the hybrid image, set the alternate entry point to the EC entry
     // point. In the hybrid view, it is swapped to the native entry point
     // using ARM64X relocations.
-    if (auto altEntrySym = cast_or_null<Defined>(ctx.hybridSymtab->entry)) {
+    if (auto altEntrySym = cast_or_null<Defined>(ctx.symtab.entry)) {
       // If the entry is an EC export thunk, use its target instead.
       if (auto thunkChunk =
               dyn_cast<ECExportThunkChunk>(altEntrySym->getChunk()))
@@ -2711,7 +2712,7 @@ void Writer::createDynamicRelocs() {
   if (ctx.symtab.entry != ctx.hybridSymtab->entry ||
       pdata.first != hybridPdata.first) {
     chpeSym = cast_or_null<DefinedRegular>(
-        ctx.hybridSymtab->findUnderscore("__chpe_metadata"));
+        ctx.symtab.findUnderscore("__chpe_metadata"));
     if (!chpeSym)
       Warn(ctx) << "'__chpe_metadata' is missing for ARM64X target";
   }
@@ -2720,14 +2721,14 @@ void Writer::createDynamicRelocs() {
     ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
                            peHeaderOffset +
                                offsetof(pe32plus_header, AddressOfEntryPoint),
-                           cast_or_null<Defined>(ctx.hybridSymtab->entry));
+                           cast_or_null<Defined>(ctx.symtab.entry));
 
     // Swap the alternate entry point in the CHPE metadata.
     if (chpeSym)
       ctx.dynamicRelocs->add(
           IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
           Arm64XRelocVal(chpeSym, offsetof(chpe_metadata, AlternateEntryPoint)),
-          cast_or_null<Defined>(ctx.symtab.entry));
+          cast_or_null<Defined>(ctx.hybridSymtab->entry));
   }
 
   if (ctx.symtab.edataStart != ctx.hybridSymtab->edataStart) {
@@ -2735,7 +2736,7 @@ void Writer::createDynamicRelocs() {
                            dataDirOffset64 +
                                EXPORT_TABLE * sizeof(data_directory) +
                                offsetof(data_directory, RelativeVirtualAddress),
-                           ctx.hybridSymtab->edataStart);
+                           ctx.symtab.edataStart);
     // The Size value is assigned after addresses are finalized.
     ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
                            dataDirOffset64 +
@@ -2773,12 +2774,12 @@ void Writer::createDynamicRelocs() {
                          dataDirOffset64 +
                              LOAD_CONFIG_TABLE * sizeof(data_directory) +
                              offsetof(data_directory, RelativeVirtualAddress),
-                         ctx.hybridSymtab->loadConfigSym);
+                         ctx.symtab.loadConfigSym);
   ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
                          dataDirOffset64 +
                              LOAD_CONFIG_TABLE * sizeof(data_directory) +
                              offsetof(data_directory, Size),
-                         ctx.hybridSymtab->loadConfigSize);
+                         ctx.symtab.loadConfigSize);
 }
 
 PartialSection *Writer::createPartialSection(StringRef name,
@@ -2889,15 +2890,14 @@ void Writer::prepareLoadConfig(SymbolTable &symtab, T *loadConfig) {
     // On ARM64X, only the EC version of the load config contains
     // CHPEMetadataPointer. Copy its value to the native load config.
     if (ctx.hybridSymtab && !symtab.isEC() &&
-        ctx.hybridSymtab->loadConfigSize >=
+        ctx.symtab.loadConfigSize >=
             offsetof(T, CHPEMetadataPointer) + sizeof(T::CHPEMetadataPointer)) {
       OutputSection *sec =
-          ctx.getOutputSection(ctx.hybridSymtab->loadConfigSym->getChunk());
+          ctx.getOutputSection(ctx.symtab.loadConfigSym->getChunk());
       uint8_t *secBuf = buffer->getBufferStart() + sec->getFileOff();
       auto hybridLoadConfig =
           reinterpret_cast<const coff_load_configuration64 *>(
-              secBuf +
-              (ctx.hybridSymtab->loadConfigSym->getRVA() - sec->getRVA()));
+              secBuf + (ctx.symtab.loadConfigSym->getRVA() - sec->getRVA()));
       loadConfig->CHPEMetadataPointer = hybridLoadConfig->CHPEMetadataPointer;
     }
   }
diff --git a/lld/test/COFF/arm64x-map.s b/lld/test/COFF/arm64x-map.s
new file mode 100644
index 0000000000000..956c52a93b041
--- /dev/null
+++ b/lld/test/COFF/arm64x-map.s
@@ -0,0 +1,60 @@
+// REQUIRES: aarch64
+// RUN: split-file %s %t.dir && cd %t.dir
+
+// RUN: llvm-mc -filetype=obj -triple=aarch64-windows arm64-data-sym.s -o arm64-data-sym.obj
+// RUN: llvm-mc -filetype=obj -triple=arm64ec-windows arm64ec-data-sym.s -o arm64ec-data-sym.obj
+// RUN: lld-link -machine:arm64x -dll -out:out.dll -map -mapinfo:exports arm64-data-sym.obj arm64ec-data-sym.obj
+// RUN: FileCheck %s < out.map
+
+// CHECK:      Start         Length     Name                   Class
+// CHECK-NEXT: 0001:00000000 00001004H .text                   CODE
+// CHECK-NEXT: 0004:00000000 00000008H .data                   DATA
+// CHECK-NEXT: 0004:00000008 00000000H .bss                    DATA
+// CHECK-EMPTY:
+// CHECK-NEXT:  Address         Publics by Value              Rva+Base               Lib:Object
+// CHECK-EMPTY:
+// CHECK-NEXT: 0001:00000000       _DllMainCRTStartup         0000000180001000     arm64-data-sym.obj
+// CHECK-NEXT: 0001:00001000       _DllMainCRTStartup         0000000180002000     arm64ec-data-sym.obj
+// CHECK-NEXT: 0004:00000000       arm64_data_sym             0000000180005000     arm64-data-sym.obj
+// CHECK-NEXT: 0004:00000004       arm64ec_data_sym           0000000180005004     arm64ec-data-sym.obj
+// CHECK-EMPTY:
+// CHECK-NEXT: entry point at         0002:00000000
+// CHECK-EMPTY:
+// CHECK-NEXT: Static symbols
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK-NEXT: Exports
+// CHECK-EMPTY:
+// CHECK-NEXT:  ordinal    name
+// CHECK-EMPTY:
+// CHECK-NEXT:        1    arm64ec_data_sym
+
+#--- arm64ec-data-sym.s
+        .text
+        .globl _DllMainCRTStartup
+_DllMainCRTStartup:
+        ret
+
+        .data
+        .globl arm64ec_data_sym
+        .p2align 2, 0x0
+arm64ec_data_sym:
+        .word 0x02020202
+
+        .section .drectve
+        .ascii "-export:arm64ec_data_sym,DATA"
+
+#--- arm64-data-sym.s
+        .text
+        .globl _DllMainCRTStartup
+_DllMainCRTStartup:
+        ret
+
+        .data
+        .globl arm64_data_sym
+        .p2align 2, 0x0
+arm64_data_sym:
+        .word 0x01010101
+
+        .section .drectve
+        .ascii "-export:arm64_data_sym,DATA"

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, this sounds reasonable. This avoids the cases for mainSymtab which I always felt was a bit unintuitive.

@@ -32,7 +32,7 @@ class COFFLinkerContext : public CommonLinkerContext {
SymbolTable symtab;
COFFOptTable optTable;

// A hybrid ARM64EC symbol table on ARM64X target.
// A native ARM64 symbol table on ARM64X target.
std::optional<SymbolTable> hybridSymtab;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if a different name than hybridSymtab would be more accurate after this step? Then again, I guess hybrid means the combination of arm64ec and arm64, so in that sense I guess "hybrid" still is relevant even if we mean the other one. (I.e. "the other symtab, for the cases when we have two".)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been considering using a name like nativeSymtab, but I'm not sure that would be an improvement. On targets other than arm64x (and arm64ec in the future), symtab already refers to the native symbol table, so renaming it could be misleading.

if (hybridSymtab)
f(*hybridSymtab);
f(symtab);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we maintain the strict order that we first deal with the native one, then the arm64ec one. I guess this makes a difference for e.g. the order within sections? (Perhaps this case would need a comment?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, things should generally work if we change the order, but it would result in a different ordering of some chunks, which would be visible. I’ll add a comment, thanks for the review!

Originally, the intent behind symtab was to represent the symbol table seen in
the PE header (without applying ARM64X relocations). However, in most cases
outside of writeHeader(), the code references either both symbol tables or only
the EC one—for example, mainSymtab in linkerMain() maps to hybridSymtab on
ARM64X.

MSVC's link.exe allows pure ARM64EC images to include native ARM64 files. This
patch prepares LLD to support the same, which will require hybridSymtab to be
available even for ARM64EC. At that point, writeHeader() will need to use the
EC symbol table, and the original reasoning for keeping it in hybridSymtab no
longer applies.

Given this, it seems cleaner to treat the EC symbol table as the “main” one,
assigning it to symtab, and use hybridSymtab for the native symbol table
instead. Since writeHeader() will need to be conditional anyway, this change
simplifies the rest of the code by allowing other parts to consistently treat
ctx.symtab as the main symbol table.

As a further simplification, this also allows us to eliminate symtabEC and use
symtab directly; I’ll submit that as a separate PR.

The map file now uses the EC symbol table for printed entry points and exports,
matching MSVC behavior.
@cjacek cjacek merged commit 1c05c61 into llvm:main Apr 11, 2025
6 of 11 checks passed
@cjacek cjacek deleted the arm64x-symtab branch April 11, 2025 13:53
cjacek added a commit to cjacek/llvm-project that referenced this pull request Apr 11, 2025
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 11, 2025

LLVM Buildbot has detected a new failure on builder lldb-x86_64-debian running on lldb-x86_64-debian while building lld at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/20026

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: lang/cpp/overloaded-functions/TestOverloadedFunctions.py (415 of 2826)
PASS: lldb-api :: lang/cpp/namespace_conflicts/TestNamespaceConflicts.py (416 of 2826)
PASS: lldb-api :: functionalities/plugins/python_os_plugin/TestPythonOSPlugin.py (417 of 2826)
PASS: lldb-api :: commands/platform/process/list/TestProcessList.py (418 of 2826)
PASS: lldb-api :: commands/target/auto-install-main-executable/TestAutoInstallMainExecutable.py (419 of 2826)
PASS: lldb-api :: lang/cpp/chained-calls/TestCppChainedCalls.py (420 of 2826)
PASS: lldb-api :: functionalities/deleted-executable/TestDeletedExecutable.py (421 of 2826)
PASS: lldb-api :: lang/cpp/fpnan/TestFPNaN.py (422 of 2826)
PASS: lldb-unit :: Core/./LLDBCoreTests/59/77 (423 of 2826)
UNRESOLVED: lldb-api :: driver/batch_mode/TestBatchMode.py (424 of 2826)
******************** TEST 'lldb-api :: driver/batch_mode/TestBatchMode.py' FAILED ********************
Script:
--
/usr/bin/python3 /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./lib --env LLVM_INCLUDE_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/include --env LLVM_TOOLS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./bin --arch x86_64 --build-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex --lldb-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/lldb --compiler /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/clang --dsymutil /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./bin --lldb-obj-root /home/worker/2.0.1/lldb-x86_64-debian/build/tools/lldb --lldb-libs-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./lib -t /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/driver/batch_mode -p TestBatchMode.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 1c05c6183da836d148eb3f558e8f8e0194f39eed)
  clang revision 1c05c6183da836d148eb3f558e8f8e0194f39eed
  llvm revision 1c05c6183da836d148eb3f558e8f8e0194f39eed
�7�[0;23r�8�[1A
(lldb) settings clear -all
�7
�[7m                                                                                �[0m�8(lldb) settings set symbols.enable-external-lookup false
(lldb) settings set target.inherit-tcc true
(lldb) settings set target.disable-aslr false
(lldb) settings set target.detach-on-error false
(lldb) settings set target.auto-apply-fixits false
(lldb) settings set plugin.process.gdb-remote.packet-timeout 60
(lldb) settings set symbols.clang-modules-cache-path "/home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api"
(lldb) settings set use-color false
(lldb) settings set show-statusline false
�7�[0;24r�8�[J(lldb) target create "/home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/driver/batch_mode/TestBatchMode.test_batch_mode_attach_exit/a.out"
Current executable set to '/home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/driver/batch_mode/TestBatchMode.test_batch_mode_attach_exit/a.out' (x86_64).
(lldb) process attach -p 4007656
Process 4007656 stopped
* thread #1, name = 'a.out', stop reason = signal SIGSTOP
    frame #0: 0x00007f4d5d4ee443 libc.so.6`clock_nanosleep + 35
libc.so.6`clock_nanosleep:
->  0x7f4d5d4ee443 <+35>: negl   %eax
    0x7f4d5d4ee445 <+37>: retq   
    0x7f4d5d4ee446 <+38>: nopw   %cs:(%rax,%rax)
    0x7f4d5d4ee450 <+48>: subq   $0x28, %rsp
(lldb) breakpoint set --file 'main.c' -p 'Stop here to unset keep_waiting' -N keep_waiting
Breakpoint 1: where = a.out`main + 221 at main.c:29:13, address = 0x000055e225e5f25d
(lldb) continue
Process 4007656 resuming

cjacek added a commit that referenced this pull request Apr 11, 2025
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…ges (llvm#135093)

Originally, the intent behind symtab was to represent the symbol table
seen in the PE header (without applying ARM64X relocations). However, in
most cases outside of `writeHeader()`, the code references either both
symbol tables or only the EC one, for example, `mainSymtab` in
`linkerMain()` maps to `hybridSymtab` on ARM64X.

MSVC's link.exe allows pure ARM64EC images to include native ARM64
files. This patch prepares LLD to support the same, which will require
`hybridSymtab` to be available even for ARM64EC. At that point,
`writeHeader()` will need to use the EC symbol table, and the original
reasoning for keeping it in `hybridSymtab` no longer applies.

Given this, it seems cleaner to treat the EC symbol table as the “main”
one, assigning it to `symtab`, and use `hybridSymtab` for the native
symbol table instead. Since `writeHeader()` will need to be conditional
anyway, this change simplifies the rest of the code by allowing other
parts to consistently treat `ctx.symtab` as the main symbol table.

As a further simplification, this also allows us to eliminate `symtabEC`
and use `symtab` directly; I’ll submit that as a separate PR.

The map file now uses the EC symbol table for printed entry points and
exports, matching MSVC behavior.
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants