-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lld-macho] Fix duplicate GOT entries for personality functions #95054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
As stated in `UnwindInfoSectionImpl::prepareRelocations`'s comments, the unwind info uses section+addend relocations for personality functions defined in the same file as the function itself. As personality functions are always accessed via the GOT, we need to resolve those to a symbol. Previously, we did this by keeping a map which resolves these to symbols, creating a synthetic symbol if we didn't find it in the map. This approach has an issue: if we process the object file containing the personality function before any external uses, the entry in the map remains unpopulated, so we create a synthetic symbol and a corresponding GOT entry. If we encounter a relocation to it in a later file which requires GOT (such as in `__eh_frame`), we add that symbol to the GOT, too, effectively creating two entries which point to the same piece of code. This commit fixes that by searching the personality function's section for a symbol at that offset which already has a GOT entry, and only creating a synthetic symbol if there is none. As all non-unwind sections are already processed by this point, it ensures no duplication. This should only really affect our tests (and make them clearer), as personality functions are usually defined in platform runtime libraries.
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-macho Author: Daniel Bertalan (BertalanD) ChangesAs stated in This approach has an issue: if we process the object file containing the personality function before any external uses, the entry in the map remains unpopulated, so we create a synthetic symbol and a corresponding GOT entry. If we encounter a relocation to it in a later file which requires GOT (such as in This commit fixes that by searching the personality function's section for a symbol at that offset which already has a GOT entry, and only creating a synthetic symbol if there is none. As all non-unwind sections are already processed by this point, it ensures no duplication. This should only really affect our tests (and make them clearer), as personality functions are usually defined in platform runtime libraries. Full diff: https://github.com/llvm/llvm-project/pull/95054.diff 3 Files Affected:
diff --git a/lld/MachO/UnwindInfoSection.cpp b/lld/MachO/UnwindInfoSection.cpp
index 0ac2f39a6180c..7033481d6014b 100644
--- a/lld/MachO/UnwindInfoSection.cpp
+++ b/lld/MachO/UnwindInfoSection.cpp
@@ -298,19 +298,31 @@ void UnwindInfoSectionImpl::prepareRelocations(ConcatInputSection *isec) {
assert(!isCoalescedWeak(referentIsec));
// Personality functions can be referenced via section relocations
// if they live in the same object file. Create placeholder synthetic
- // symbols for them in the GOT.
+ // symbols for them in the GOT. If the corresponding symbol is already
+ // in the GOT, use that to avoid creating a duplicate entry. All GOT
+ // entries needed by non-unwind sections will have already been added
+ // by this point.
Symbol *&s = personalityTable[{referentIsec, r.addend}];
if (s == nullptr) {
- // This runs after dead stripping, so the noDeadStrip argument does not
- // matter.
- s = make<Defined>("<internal>", /*file=*/nullptr, referentIsec,
- r.addend, /*size=*/0, /*isWeakDef=*/false,
- /*isExternal=*/false, /*isPrivateExtern=*/false,
- /*includeInSymtab=*/true,
- /*isReferencedDynamically=*/false,
- /*noDeadStrip=*/false);
- s->used = true;
- in.got->addEntry(s);
+ Defined *const *gotEntry =
+ llvm::find_if(referentIsec->symbols, [&](Defined const *d) {
+ return d->value == static_cast<uint64_t>(r.addend) &&
+ d->isInGot();
+ });
+ if (gotEntry != referentIsec->symbols.end()) {
+ s = *gotEntry;
+ } else {
+ // This runs after dead stripping, so the noDeadStrip argument does
+ // not matter.
+ s = make<Defined>("<internal>", /*file=*/nullptr, referentIsec,
+ r.addend, /*size=*/0, /*isWeakDef=*/false,
+ /*isExternal=*/false, /*isPrivateExtern=*/false,
+ /*includeInSymtab=*/true,
+ /*isReferencedDynamically=*/false,
+ /*noDeadStrip=*/false);
+ s->used = true;
+ in.got->addEntry(s);
+ }
}
r.referent = s;
r.addend = 0;
diff --git a/lld/test/MachO/compact-unwind-both-local-and-dylib-personality.s b/lld/test/MachO/compact-unwind-both-local-and-dylib-personality.s
index 676577d6b17e9..35f39ba5fb1e2 100644
--- a/lld/test/MachO/compact-unwind-both-local-and-dylib-personality.s
+++ b/lld/test/MachO/compact-unwind-both-local-and-dylib-personality.s
@@ -42,19 +42,20 @@
# RUN: llvm-objdump --macho --indirect-symbols --unwind-info --bind %t/d.out | FileCheck %s --check-prefixes=D -D#%x,OFF=0x100000000
-# A: Indirect symbols for (__DATA_CONST,__got)
+# A: Indirect symbols for (__DATA_CONST,__got) 4 entries
# A-NEXT: address index name
# A: 0x[[#%x,GXX_PERSONALITY_LO:]] [[#]] ___gxx_personality_v0
+# A: 0x[[#%x,PERSONALITY_1:]] [[#]] _personality_1
+# A: 0x[[#%x,PERSONALITY_2:]] [[#]] _personality_2
# A: 0x[[#%x,GXX_PERSONALITY_HI:]] [[#]] ___gxx_personality_v0
-# A: 0x[[#%x,PERSONALITY_1:]] LOCAL
-# A: 0x[[#%x,PERSONALITY_2:]] LOCAL
# BC: Indirect symbols for (__DATA_CONST,__got)
# BC-NEXT: address index name
-# C: 0x[[#%x,GXX_PERSONALITY_HI:]] LOCAL
# BC: 0x[[#%x,GXX_PERSONALITY_LO:]] LOCAL
-# BC: 0x[[#%x,PERSONALITY_1:]] LOCAL
-# BC: 0x[[#%x,PERSONALITY_2:]] LOCAL
+# C: 0x[[#%x,GXX_PERSONALITY_HI:]] [[#]] ___gxx_personality_v0
+# BC: 0x[[#%x,PERSONALITY_1:]] [[#]] _personality_1
+# BC: 0x[[#%x,PERSONALITY_2:]] [[#]] _personality_2
+# BC-EMPTY:
# CHECK: Personality functions: (count = 3)
# CHECK-DAG: personality[{{[0-9]+}}]: 0x{{0*}}[[#GXX_PERSONALITY_LO-OFF]]
@@ -66,7 +67,7 @@
# A-NEXT: __DATA_CONST __got 0x[[#GXX_PERSONALITY_LO-0]] pointer 0 libc++abi ___gxx_personality_v0
-# D: Indirect symbols for (__DATA_CONST,__got)
+# D: Indirect symbols for (__DATA_CONST,__got) 6 entries
# D-NEXT: address index name
# D: 0x[[#%x,GXX_PERSONALITY_HI:]] [[#]] ___gxx_personality_v0
# D: 0x[[#%x,PERSONALITY_1:]] [[#]] _personality_1
diff --git a/lld/test/MachO/compact-unwind.s b/lld/test/MachO/compact-unwind.s
index fa73ccb10a32a..27e4b44dc0b09 100644
--- a/lld/test/MachO/compact-unwind.s
+++ b/lld/test/MachO/compact-unwind.s
@@ -29,7 +29,7 @@
# FIRST: Indirect symbols for (__DATA_CONST,__got)
# FIRST-NEXT: address index name
# FIRST-DAG: 0x[[#%x,GXX_PERSONALITY:]] [[#]] ___gxx_personality_v0
-# FIRST-DAG: 0x[[#%x,MY_PERSONALITY:]] LOCAL
+# FIRST-DAG: 0x[[#%x,MY_PERSONALITY:]]
# SECOND: Indirect symbols for (__DATA_CONST,__got)
# SECOND-NEXT: address index name
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me, thanks!
@oontvoo has looked at personality functions in lld in the past, and @speednoisemovement is generally looking at lld things, so cc'ing them for their information. Feel free to hit "merge" in a few hours though :)
Just wanted to point out that, we have seen a few use-cases that involve custom personality symbols (IOWs., they're user-defined symbols and NOT in the runtime libraries) . In fact, the The change itself LGTM, though. |
As stated in
UnwindInfoSectionImpl::prepareRelocations
's comments, the unwind info uses section+addend relocations for personality functions defined in the same file as the function itself. As personality functions are always accessed via the GOT, we need to resolve those to a symbol. Previously, we did this by keeping a map which resolves these to symbols, creating a synthetic symbol if we didn't find it in the map.This approach has an issue: if we process the object file containing the personality function before any external uses, the entry in the map remains unpopulated, so we create a synthetic symbol and a corresponding GOT entry. If we encounter a relocation to it in a later file which requires GOT (such as in
__eh_frame
), we add that symbol to the GOT, too, effectively creating two entries which point to the same piece of code.This commit fixes that by searching the personality function's section for a symbol at that offset which already has a GOT entry, and only creating a synthetic symbol if there is none. As all non-unwind sections are already processed by this point, it ensures no duplication.
This should only really affect our tests (and make them clearer), as personality functions are usually defined in platform runtime libraries.