Skip to content

[lld-macho] Omit __llvm_addrsig metadata from the output #98913

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 16, 2024

Conversation

BertalanD
Copy link
Member

This section contains metadata that's only relevant for Identical Code Folding at link time, we should not include it in the output.

We still treat it like a regular section during input file parsing (e.g. create a ConcatInputSection for it), as we want its relocations to be parsed. But it should not be passed to addInputSection, as that's what assigns it to an OutputSection and adds it to the inputSections vector which specifies the inputs to dead-stripping and relocation scanning.

This fixes a "__DATA,__llvm_addrsig, offset 0: fixups overlap" error when using --icf=safe alongside -fixup_chains. This occurs because all __llvm_addrsig sections are 8 bytes large, and the relocations which signify functions whose addresses are taken are all at offset 0.

This makes the fix in 5fa24ac ("Category Merger: add support for addrsig references") obsolete, as we no longer try to resolve symbols referenced in __llvm_addrsig when writing the output file. When we do iterate its relocations in markAddrSigSymbols, we do not try to resolve their addresses.


@alx32 I believe this PR has the same effect as #90903, can you check if the code that prompted your PR still links?

This section contains metadata that's only relevant for Identical Code
Folding at link time, we should not include it in the output.

We still treat it like a regular section during input file parsing (e.g.
create a `ConcatInputSection` for it), as we want its relocations to be
parsed. But it should not be passed to `addInputSection`, as that's what
assigns it to an `OutputSection` and adds it to the `inputSections`
vector which specifies the inputs to dead-stripping and relocation
scanning.

This fixes a "__DATA,__llvm_addrsig, offset 0: fixups overlap" error
when using `--icf=safe` alongside `-fixup_chains`. This occurs because
all `__llvm_addrsig` sections are 8 bytes large, and the relocations
which signify functions whose addresses are taken are all at offset 0.

This makes the fix in 5fa24ac ("Category Merger: add support for
addrsig references") obsolete, as we no longer try to resolve symbols
referenced in `__llvm_addrsig` when writing the output file. When we do
iterate its relocations in `markAddrSigSymbols`, we do not try to
resolve their addresses.
@llvmbot
Copy link
Member

llvmbot commented Jul 15, 2024

@llvm/pr-subscribers-lld-macho

@llvm/pr-subscribers-lld

Author: Daniel Bertalan (BertalanD)

Changes

This section contains metadata that's only relevant for Identical Code Folding at link time, we should not include it in the output.

We still treat it like a regular section during input file parsing (e.g. create a ConcatInputSection for it), as we want its relocations to be parsed. But it should not be passed to addInputSection, as that's what assigns it to an OutputSection and adds it to the inputSections vector which specifies the inputs to dead-stripping and relocation scanning.

This fixes a "__DATA,__llvm_addrsig, offset 0: fixups overlap" error when using --icf=safe alongside -fixup_chains. This occurs because all __llvm_addrsig sections are 8 bytes large, and the relocations which signify functions whose addresses are taken are all at offset 0.

This makes the fix in 5fa24ac ("Category Merger: add support for addrsig references") obsolete, as we no longer try to resolve symbols referenced in __llvm_addrsig when writing the output file. When we do iterate its relocations in markAddrSigSymbols, we do not try to resolve their addresses.


@alx32 I believe this PR has the same effect as #90903, can you check if the code that prompted your PR still links?


Full diff: https://github.com/llvm/llvm-project/pull/98913.diff

4 Files Affected:

  • (modified) lld/MachO/Driver.cpp (+3)
  • (modified) lld/MachO/ObjC.cpp (-32)
  • (modified) lld/test/MachO/dead-strip.s (+24)
  • (modified) lld/test/MachO/icf-safe.ll (+7-4)
diff --git a/lld/MachO/Driver.cpp b/lld/MachO/Driver.cpp
index 28c28f29defd1..a370d5734124a 100644
--- a/lld/MachO/Driver.cpp
+++ b/lld/MachO/Driver.cpp
@@ -1247,6 +1247,9 @@ static void gatherInputSections() {
       // contrast, EH frames are handled like regular ConcatInputSections.)
       if (section->name == section_names::compactUnwind)
         continue;
+      // Addrsig sections contain metadata only needed at link time.
+      if (section->name == section_names::addrSig)
+        continue;
       for (const Subsection &subsection : section->subsections)
         addInputSection(subsection.isec);
     }
diff --git a/lld/MachO/ObjC.cpp b/lld/MachO/ObjC.cpp
index 740ebaf7e0403..4a6f99654ba13 100644
--- a/lld/MachO/ObjC.cpp
+++ b/lld/MachO/ObjC.cpp
@@ -449,7 +449,6 @@ class ObjcCategoryMerger {
   mergeCategoriesIntoSingleCategory(std::vector<InfoInputCategory> &categories);
 
   void eraseISec(ConcatInputSection *isec);
-  void removeRefsToErasedIsecs();
   void eraseMergedCategories();
 
   void generateCatListForNonErasedCategories(
@@ -519,8 +518,6 @@ class ObjcCategoryMerger {
   std::vector<ConcatInputSection *> &allInputSections;
   // Map of base class Symbol to list of InfoInputCategory's for it
   MapVector<const Symbol *, std::vector<InfoInputCategory>> categoryMap;
-  // Set for tracking InputSection erased via eraseISec
-  DenseSet<InputSection *> erasedIsecs;
 
   // Normally, the binary data comes from the input files, but since we're
   // generating binary data ourselves, we use the below array to store it in.
@@ -1272,8 +1269,6 @@ void ObjcCategoryMerger::generateCatListForNonErasedCategories(
 }
 
 void ObjcCategoryMerger::eraseISec(ConcatInputSection *isec) {
-  erasedIsecs.insert(isec);
-
   isec->live = false;
   for (auto &sym : isec->symbols)
     sym->used = false;
@@ -1326,33 +1321,6 @@ void ObjcCategoryMerger::eraseMergedCategories() {
                                   catLayout.instancePropsOffset);
     }
   }
-
-  removeRefsToErasedIsecs();
-}
-
-// The compiler may generate references to categories inside the addrsig
-// section. This function will erase these references.
-void ObjcCategoryMerger::removeRefsToErasedIsecs() {
-  for (InputSection *isec : inputSections) {
-    if (isec->getName() != section_names::addrSig)
-      continue;
-
-    auto removeRelocs = [this](Reloc &r) {
-      auto *isec = dyn_cast_or_null<ConcatInputSection>(
-          r.referent.dyn_cast<InputSection *>());
-      if (!isec) {
-        Defined *sym =
-            dyn_cast_or_null<Defined>(r.referent.dyn_cast<Symbol *>());
-        if (sym)
-          isec = dyn_cast<ConcatInputSection>(sym->isec());
-      }
-      if (!isec)
-        return false;
-      return erasedIsecs.count(isec) > 0;
-    };
-
-    llvm::erase_if(isec->relocs, removeRelocs);
-  }
 }
 
 void ObjcCategoryMerger::doMerge() {
diff --git a/lld/test/MachO/dead-strip.s b/lld/test/MachO/dead-strip.s
index f593b69843ba6..d107dad53a3c5 100644
--- a/lld/test/MachO/dead-strip.s
+++ b/lld/test/MachO/dead-strip.s
@@ -329,6 +329,17 @@
 # LIT-NEXT: Contents of (__TEXT,__literals) section
 # LIT-NEXT: ef be ad de {{$}}
 
+## Ensure that addrsig metadata does not keep unreferenced functions alive.
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-macos \
+# RUN:     %t/addrsig.s -o %t/addrsig.o
+# RUN: %lld -lSystem -dead_strip --icf=safe %t/addrsig.o -o %t/addrsig
+# RUN: llvm-objdump --syms %t/addrsig | \
+# RUN:     FileCheck --check-prefix=ADDSIG --implicit-check-not _addrsig %s
+# ADDSIG-LABEL: SYMBOL TABLE:
+# ADDSIG-NEXT:   g F __TEXT,__text _main
+# ADDSIG-NEXT:   g F __TEXT,__text __mh_execute_header
+# ADDSIG-NEXT:   *UND* dyld_stub_binder
+
 ## Duplicate symbols that will be dead stripped later should not fail when using
 ## the --dead-stripped-duplicates flag
 # RUN: llvm-mc -filetype=obj -triple=x86_64-apple-macos \
@@ -988,3 +999,16 @@ _more_data:
 _main:
   callq _ref_undef_fun
 .subsections_via_symbols
+
+#--- addrsig.s
+.globl _main, _addrsig
+_main:
+  retq
+
+_addrsig:
+  retq
+
+.subsections_via_symbols
+
+.addrsig
+.addrsig_sym _addrsig
diff --git a/lld/test/MachO/icf-safe.ll b/lld/test/MachO/icf-safe.ll
index 71c6f9f7ddac8..7012de56e6938 100644
--- a/lld/test/MachO/icf-safe.ll
+++ b/lld/test/MachO/icf-safe.ll
@@ -5,14 +5,14 @@
 ; RUN: llc -filetype=obj %s -O3 -o %t/icf-obj.o -enable-machine-outliner=never -mtriple arm64-apple-macos -addrsig
 ; RUN: %lld -arch arm64 -lSystem --icf=safe -dylib -o %t/icf-safe.dylib %t/icf-obj.o
 ; RUN: %lld -arch arm64 -lSystem --icf=all  -dylib -o %t/icf-all.dylib  %t/icf-obj.o
-; RUN: llvm-objdump %t/icf-safe.dylib -d --macho | FileCheck %s --check-prefix=ICFSAFE
-; RUN: llvm-objdump %t/icf-all.dylib  -d --macho | FileCheck %s --check-prefix=ICFALL
+; RUN: llvm-objdump %t/icf-safe.dylib -d -h --macho | FileCheck %s --check-prefixes=ICFSAFE,CHECK
+; RUN: llvm-objdump %t/icf-all.dylib  -d -h --macho | FileCheck %s --check-prefixes=ICFALL,CHECK
 
 ; RUN: llvm-as %s -o %t/icf-bitcode.o
 ; RUN: %lld -arch arm64 -lSystem --icf=safe -dylib -o %t/icf-safe-bitcode.dylib %t/icf-bitcode.o
 ; RUN: %lld -arch arm64 -lSystem --icf=all  -dylib -o %t/icf-all-bitcode.dylib %t/icf-bitcode.o
-; RUN: llvm-objdump %t/icf-safe-bitcode.dylib -d --macho | FileCheck %s --check-prefix=ICFSAFE
-; RUN: llvm-objdump %t/icf-all-bitcode.dylib  -d --macho | FileCheck %s --check-prefix=ICFALL
+; RUN: llvm-objdump %t/icf-safe-bitcode.dylib -d -h --macho | FileCheck %s --check-prefixes=ICFSAFE,CHECK
+; RUN: llvm-objdump %t/icf-all-bitcode.dylib  -d -h --macho | FileCheck %s --check-prefixes=ICFALL,CHECK
 
 ; ICFSAFE-LABEL:  _callAllFunctions
 ; ICFSAFE:        bl _func02
@@ -24,6 +24,9 @@
 ; ICFALL-NEXT:    bl _func03_takeaddr
 ; ICFALL-NEXT:    bl _func03_takeaddr
 
+; CHECK-LABEL: Sections:
+; CHECK-NOT:   __llvm_addrsig
+
 target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
 target triple = "arm64-apple-macos11.0"
 

@BertalanD BertalanD requested a review from alx32 July 15, 2024 14:45
@alx32
Copy link
Contributor

alx32 commented Jul 16, 2024

I've tested the link unit that previously needed #90903 to work and it still links after this PR.

Related: I was planning to publish in the next few days another PR that depends on the existing code ( added in #90903) - basically the same logic of removing references to erased items, but this time for the _constg_swiftt section - not for __llvm_addrsig.

Unlinke __llvm_addrsig I don't think we can use the same approach for _constg_swiftt that this PR is using for __llvm_addrsig (not calling addInputSection).

@BertalanD - Should we just remove the removeRefsToErasedIsecs (and related) code for now and I'll re-add it later if needed ? Or should we leave it for now and I'll re-use with the _constg_swiftt change - or remove it in 1-2 weeks if the _constg_swiftt change ends up not needing it ?

@BertalanD
Copy link
Member Author

BertalanD commented Jul 16, 2024

Thank you for checking!

I would prefer merging this as-is, as removeRefsToErasedIsecs would otherwise be dead code until your PR is merged. It won't be too much work to add it back if it turns out necessary.

(I'm a bit skeptical about it: does the swift runtime expect that the pointers with the erased relocs might be null? What does ld do?)

Sidenote: I tried looking into what the _constg_swiftt could be about. Didn't get far, because enabling category merging and relative method lists together crashes LLD. Will file a bug about it.

@alx32
Copy link
Contributor

alx32 commented Jul 16, 2024

I would prefer merging this as-is

Sounds good.

[...] the pointers with the erased relocs might be null? What does ld do?

In limited testing it worked for me - I haven't yet looked into ld64 - that change is still work in progress.

@BertalanD BertalanD merged commit 6ad2987 into llvm:main Jul 16, 2024
5 of 6 checks passed
@BertalanD BertalanD deleted the addrsig-output branch July 16, 2024 22:41
BertalanD added a commit to BertalanD/llvm-project that referenced this pull request Jul 16, 2024
This reverts commit f55b79f.

The known issues with chained fixups have been addressed by llvm#98913,
llvm#98305, llvm#97156 and llvm#95171.

Compared to the original commit, support for xrOS (which postdates
chained fixups' introduction) was added and an unnecessary test change
was removed.

----------
Original commit message:

Enable chained fixups in lld when all platform and version criteria are
met. This is an attempt at simplifying the logic used in ld 907:

https://github.com/apple-oss-distributions/ld64/blob/93d74eafc37c0558b4ffb88a8bc15c17bed44a20/src/ld/Options.cpp#L5458-L5549

Some changes were made to simplify the logic:
- only enable chained fixups for macOS from 13.0 to avoid the arch check
- only enable chained fixups for iphonesimulator from 16.0 to avoid the
arch check
- don't enable chained fixups for not specifically listed platforms
- don't enable chained fixups for arm64_32
BertalanD added a commit that referenced this pull request Jul 22, 2024
This reverts commit f55b79f.

The known issues with chained fixups have been addressed by #98913,
#98305, #97156 and #95171.

Compared to the original commit, support for xrOS (which postdates
chained fixups' introduction) was added and an unnecessary test change
was removed.

----------
Original commit message:

Enable chained fixups in lld when all platform and version criteria are
met. This is an attempt at simplifying the logic used in ld 907:


https://github.com/apple-oss-distributions/ld64/blob/93d74eafc37c0558b4ffb88a8bc15c17bed44a20/src/ld/Options.cpp#L5458-L5549

Some changes were made to simplify the logic:
- only enable chained fixups for macOS from 13.0 to avoid the arch check
- only enable chained fixups for iphonesimulator from 16.0 to avoid the
arch check
- don't enable chained fixups for not specifically listed platforms
- don't enable chained fixups for arm64_32
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
This section contains metadata that's only relevant for Identical Code
Folding at link time, we should not include it in the output.

We still treat it like a regular section during input file parsing (e.g.
create a `ConcatInputSection` for it), as we want its relocations to be
parsed. But it should not be passed to `addInputSection`, as that's what
assigns it to an `OutputSection` and adds it to the `inputSections`
vector which specifies the inputs to dead-stripping and relocation
scanning.

This fixes a "__DATA,__llvm_addrsig, offset 0: fixups overlap" error
when using `--icf=safe` alongside `-fixup_chains`. This occurs because
all `__llvm_addrsig` sections are 8 bytes large, and the relocations
which signify functions whose addresses are taken are all at offset 0.

This makes the fix in 5fa24ac ("Category Merger: add support for
addrsig references") obsolete, as we no longer try to resolve symbols
referenced in `__llvm_addrsig` when writing the output file. When we do
iterate its relocations in `markAddrSigSymbols`, we do not try to
resolve their addresses.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60250821
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
This reverts commit f55b79f.

The known issues with chained fixups have been addressed by #98913,
#98305, #97156 and #95171.

Compared to the original commit, support for xrOS (which postdates
chained fixups' introduction) was added and an unnecessary test change
was removed.

----------
Original commit message:

Enable chained fixups in lld when all platform and version criteria are
met. This is an attempt at simplifying the logic used in ld 907:


https://github.com/apple-oss-distributions/ld64/blob/93d74eafc37c0558b4ffb88a8bc15c17bed44a20/src/ld/Options.cpp#L5458-L5549

Some changes were made to simplify the logic:
- only enable chained fixups for macOS from 13.0 to avoid the arch check
- only enable chained fixups for iphonesimulator from 16.0 to avoid the
arch check
- don't enable chained fixups for not specifically listed platforms
- don't enable chained fixups for arm64_32

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251071
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants