Skip to content

Reapply [llvm-ar] Use COFF archive format for COFF targets. #82642 with fixes for no symbol table handling. #82898

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 13, 2024

Conversation

cjacek
Copy link
Contributor

@cjacek cjacek commented Feb 24, 2024

Detect COFF files by default and allow specifying it with --format argument.

This is important for ARM64EC, which uses a separated symbol map for EC symbols. Since K_COFF is mostly compatible with K_GNU, this shouldn't really make a difference for other targets.

This originally landed as #82642, but was reverted due to test failures in tests using no symbol table. Since COFF symbol can't express it, fallback to GNU format in that case.

@llvmbot
Copy link
Member

llvmbot commented Feb 24, 2024

@llvm/pr-subscribers-llvm-binary-utilities

Author: Jacek Caban (cjacek)

Changes

I noticed it due to a test failure in #82642 in a test that creates an archive with 'S' (no symbol table) option. Reader recognizes coff format from gnu format from presence of the extra symbol map member, so this can't work for archives with no symbol table. Just use gnu format in archive writer in such case.


Full diff: https://github.com/llvm/llvm-project/pull/82898.diff

2 Files Affected:

  • (modified) llvm/lib/Object/ArchiveWriter.cpp (+4-3)
  • (added) llvm/test/tools/llvm-ar/no-symtab.yaml (+32)
diff --git a/llvm/lib/Object/ArchiveWriter.cpp b/llvm/lib/Object/ArchiveWriter.cpp
index 02f72521c8b544..9af2b6e94a270a 100644
--- a/llvm/lib/Object/ArchiveWriter.cpp
+++ b/llvm/lib/Object/ArchiveWriter.cpp
@@ -966,10 +966,12 @@ static Error writeArchiveToStream(raw_ostream &Out,
   SmallString<0> StringTableBuf;
   raw_svector_ostream StringTable(StringTableBuf);
   SymMap SymMap;
+  bool ShouldWriteSymtab = WriteSymtab != SymtabWritingMode::NoSymtab;
 
   // COFF symbol map uses 16-bit indexes, so we can't use it if there are too
-  // many members.
-  if (isCOFFArchive(Kind) && NewMembers.size() > 0xfffe)
+  // many members. COFF format also requires symbol table presence, so use
+  // GNU format when NoSymtab is requested.
+  if (isCOFFArchive(Kind) && (NewMembers.size() > 0xfffe || !ShouldWriteSymtab))
     Kind = object::Archive::K_GNU;
 
   // In the scenario when LLVMContext is populated SymbolicFile will contain a
@@ -998,7 +1000,6 @@ static Error writeArchiveToStream(raw_ostream &Out,
   uint64_t LastMemberHeaderOffset = 0;
   uint64_t NumSyms = 0;
   uint64_t NumSyms32 = 0; // Store symbol number of 32-bit member files.
-  bool ShouldWriteSymtab = WriteSymtab != SymtabWritingMode::NoSymtab;
 
   for (const auto &M : Data) {
     // Record the start of the member's offset
diff --git a/llvm/test/tools/llvm-ar/no-symtab.yaml b/llvm/test/tools/llvm-ar/no-symtab.yaml
new file mode 100644
index 00000000000000..7370c9b3235522
--- /dev/null
+++ b/llvm/test/tools/llvm-ar/no-symtab.yaml
@@ -0,0 +1,32 @@
+## Create archives with no symtab in various formats and check that we can read them.
+
+# RUN: yaml2obj %s -o %t.o
+# RUN: rm -f %t.*.a
+
+# RUN: llvm-ar --format=gnu rcS %t.gnu.a %t.o
+# RUN: llvm-ar --format=coff rcS %t.coff.a %t.o
+# RUN: llvm-ar --format=darwin rcS %t.darwin.a %t.o
+# RUN: llvm-ar --format=bsd rcS %t.bsd.a %t.o
+# RUN: llvm-ar --format=bigarchive rcS %t.bigarchive.a %t.o
+
+# RUN: llvm-nm --print-armap %t.gnu.a | FileCheck %s
+# RUN: llvm-nm --print-armap %t.coff.a | FileCheck %s
+# RUN: llvm-nm --print-armap %t.darwin.a | FileCheck %s
+# RUN: llvm-nm --print-armap %t.bsd.a | FileCheck %s
+# RUN: llvm-nm --print-armap %t.bigarchive.a | FileCheck %s
+
+# CHECK-NOT: Archive map
+
+--- !ELF
+FileHeader:
+  Class:   ELFCLASS64
+  Data:    ELFDATA2LSB
+  Type:    ET_REL
+  Machine: EM_X86_64
+Sections:
+  - Name: .text
+    Type: SHT_PROGBITS
+Symbols:
+  - Name:    symbol
+    Binding: STB_GLOBAL
+    Section: .text

@efriedma-quic
Copy link
Collaborator

I was going to ask what MSVC does... but apparently, it specifically forbids creating empty archives (LNK1160), so there's no equivalent.

If someone creates an archive, then add something to it/removes something from it, does llvm-ar/llvm-lib rerun the heuristic for choosing an archive format? If it does, this is probably fine.

@cjacek
Copy link
Contributor Author

cjacek commented Feb 26, 2024

I was going to ask what MSVC does... but apparently, it specifically forbids creating empty archives (LNK1160), so there's no equivalent.

Yes, the closest I could do was adding a file with no symbols, but lib.exe just emits an empty symbol table then.

If someone creates an archive, then add something to it/removes something from it, does llvm-ar/llvm-lib rerun the heuristic for choosing an archive format? If it does, this is probably fine.

With the previous version, we'd then use the COFF format. A similar logic to mitigate it is already there for BSD/DARWIN distinction, so we could just use that. I added a commit that does that. I guarded it with hasSymbolTable() to preserve current behavior in irrelevant cases, but we could potentially skip that.

(This PR now depends on reverted #82642, but should be committed first; I plan to rebase and move things a bit between them before pushing to get it right).

@@ -14,6 +14,12 @@ RUN: yaml2obj elf.yaml -o coff-symtab.o
RUN: llvm-ar crs --format coff out3.a coff-symtab.o
RUN: llvm-nm --print-armap out3.a | FileCheck %s

Create an empty archive with no symbol map, add a COFF file to it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful to know what is expected here, e.g. something like "Show that the output archive is a COFF archive."

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please wait for someone else, e.g. @efriedma-quic.

@cjacek
Copy link
Contributor Author

cjacek commented Mar 12, 2024

Looking at this again, it probably makes sense to commit together with relanding #82642. We need parts of that for this PR to be testable, while this PR would need to land first to not break tests with #82642. I force-pushed a commit that contains both (previously reviewed) #82642 and fixes from this PR.

Is it okay to push it like that?

@jh7370
Copy link
Collaborator

jh7370 commented Mar 13, 2024

Sounds sensible to me. Please go ahead (though I'd still prefer someone from the COFF side of things to take a quick look).

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look reasonable to me, so LGTM. But the PR subject+description (which end up as commit message for the new commit) makes this a bit hard to understand - I'd prefer to rephrase it as a copy of the original commit message (which is the main effect of the commit once it lands on main), while the incremental changes from the previous PR just is a minor detail in the big picture. I.e. "Reapply [llvm-ar] Use COFF archive format ...", same commit message as the previous one, but with the story of original PR, revert (and reason for it) and reapplication (with what is changed to remedy the issue) as an extra final paragraph in that commit message.

@cjacek cjacek changed the title [llvm-ar][Object] Use K_GNU instead of K_COFF for archives with no symtab. Reapply [llvm-ar][Object] Use K_GNU instead of K_COFF for archives with no symtab. #82898 with fixes for no symbol table handling. Mar 13, 2024
@cjacek cjacek changed the title Reapply [llvm-ar][Object] Use K_GNU instead of K_COFF for archives with no symtab. #82898 with fixes for no symbol table handling. Reapply [llvm-ar] Use COFF archive format for COFF targets. #82642 with fixes for no symbol table handling. Mar 13, 2024
Detect COFF files by default and allow specifying it with --format
argument.

This is important for ARM64EC, which uses a separated symbol map for EC
symbols. Since K_COFF is mostly compatible with K_GNU, this shouldn't
really make a difference for other targets.

This originally landed as llvm#82642, but was reverted due to test failures
in tests using no symbol table. Since COFF symbol can't express it,
fallback to GNU format in that case.
@cjacek cjacek merged commit 2037577 into llvm:main Mar 13, 2024
@cjacek cjacek deleted the ar-no-symtab branch March 13, 2024 12:27
@jakeegan
Copy link
Member

Hi, this test is failing on AIX, could you take a look?
https://lab.llvm.org/buildbot/#/builders/214/builds/11580/steps/6/logs/FAIL__LLVM__coff-symtab_test

cjacek added a commit to cjacek/llvm-project that referenced this pull request Mar 13, 2024
cjacek added a commit that referenced this pull request Mar 13, 2024
@cjacek
Copy link
Contributor Author

cjacek commented Mar 13, 2024

The test accidentally depended on default archive format. I pushed a fix in #85112.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants