Skip to content

Commit 0557b1b

Browse files
committed
[ELF] Resolve defined symbols before undefined symbols
When parsing an object file, LLD interleaves undefined symbol resolution (which may recursively fetch other lazy objects) with defined symbol resolution. This may lead to surprising results, e.g. if an object file defines currently undefined symbols and references another lazy symbol, we may interleave defined symbols with the lazy fetch, potentially leading to the defined symbols resolving to different files. As an example, if both `a.a(a.o)` and `a.a(b.o)` define `foo` (not in COMDAT group, or in different COMDAT groups) and `__profd_foo` (in COMDAT group `__profd_foo`). LLD may resolve `foo` to `a.a(a.o)` and `__profd_foo` to `b.a(b.o)`, i.e. different files. ``` parse ArchiveFile a.a entry fetches a.a(a.o) parse ObjectFile a.o define entry define foo reference b b fetches a.a(b.o) parse ObjectFile b.o define prevailing __profd_foo define (ignored) non-prevailing __profd_foo ``` Assuming a set of interconnected symbols are defined all or none in several lazy objects. Arguably making them resolve to the same file is preferable than making them resolve to different files (some are lazy objects). The main argument favoring the new behavior is the stability. The relative order between a defined symbol and an undefined symbol does not change the symbol resolution behavior. Only the relative order between two undefined symbols can affect fetching behaviors. --- The real world case is reduced from a Fuchsia PGO usage: `a.a(a.o)` has a constructor within COMDAT group C5 while `a.a(b.o)` has a constructor within COMDAT group C2. Because they use different group signatures, they are not de-duplicated. It is not entirely whether Clang behavior is entirely conforming. LLD selects the PGO counter section (`__profd_*`) from `a.a(b.o)` and the constructor section from `a.a(a.o)`. The `__profd_*` is a SHF_LINK_ORDER section linking to its own non-prevailing constructor section, so LLD errors `sh_link points to discarded section`. This patch fixes the error. Differential Revision: https://reviews.llvm.org/D95985
1 parent 18d38b2 commit 0557b1b

File tree

2 files changed

+58
-2
lines changed

2 files changed

+58
-2
lines changed

lld/ELF/InputFiles.cpp

Lines changed: 16 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1138,6 +1138,7 @@ template <class ELFT> void ObjFile<ELFT>::initializeSymbols() {
11381138
}
11391139

11401140
// Symbol resolution of non-local symbols.
1141+
SmallVector<unsigned, 32> unds;
11411142
for (size_t i = firstGlobal, end = eSyms.size(); i != end; ++i) {
11421143
const Elf_Sym &eSym = eSyms[i];
11431144
uint8_t binding = eSym.getBinding();
@@ -1154,8 +1155,7 @@ template <class ELFT> void ObjFile<ELFT>::initializeSymbols() {
11541155

11551156
// Handle global undefined symbols.
11561157
if (eSym.st_shndx == SHN_UNDEF) {
1157-
this->symbols[i]->resolve(Undefined{this, name, binding, stOther, type});
1158-
this->symbols[i]->referenced = true;
1158+
unds.push_back(i);
11591159
continue;
11601160
}
11611161

@@ -1202,6 +1202,20 @@ template <class ELFT> void ObjFile<ELFT>::initializeSymbols() {
12021202

12031203
fatal(toString(this) + ": unexpected binding: " + Twine((int)binding));
12041204
}
1205+
1206+
// Undefined symbols (excluding those defined relative to non-prevailing
1207+
// sections) can trigger recursive fetch. Process defined symbols first so
1208+
// that the relative order between a defined symbol and an undefined symbol
1209+
// does not change the symbol resolution behavior. In addition, a set of
1210+
// interconnected symbols will all be resolved to the same file, instead of
1211+
// being resolved to different files.
1212+
for (unsigned i : unds) {
1213+
const Elf_Sym &eSym = eSyms[i];
1214+
StringRefZ name = this->stringTable.data() + eSym.st_name;
1215+
this->symbols[i]->resolve(Undefined{this, name, eSym.getBinding(),
1216+
eSym.st_other, eSym.getType()});
1217+
this->symbols[i]->referenced = true;
1218+
}
12051219
}
12061220

12071221
ArchiveFile::ArchiveFile(std::unique_ptr<Archive> &&file)

lld/test/ELF/interconnected-lazy.s

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
# REQUIRES: x86
2+
3+
# RUN: split-file %s %t
4+
# RUN: llvm-mc -filetype=obj -triple=x86_64 %t/main.s -o %t/main.o
5+
# RUN: llvm-mc -filetype=obj -triple=x86_64 %t/a.s -o %t/a.o
6+
# RUN: llvm-mc -filetype=obj -triple=x86_64 %t/b.s -o %t/b.o
7+
8+
## foo and __foo are interconnected and defined in two lazy object files.
9+
## Test we resolve both to the same file.
10+
# RUN: ld.lld -y a -y foo -y __foo %t/main.o --start-lib %t/a.o %t/b.o --end-lib -o /dev/null | FileCheck %s
11+
12+
# CHECK: a.o: lazy definition of __foo
13+
# CHECK-NEXT: a.o: lazy definition of a
14+
# CHECK-NEXT: a.o: lazy definition of foo
15+
# CHECK-NEXT: b.o: definition of __foo
16+
# CHECK-NEXT: b.o: definition of foo
17+
# CHECK-NEXT: b.o: reference to a
18+
# CHECK-NEXT: a.o: definition of a
19+
20+
#--- main.s
21+
.globl _start
22+
_start:
23+
call b
24+
25+
#--- a.s
26+
.globl a
27+
.weak foo
28+
a:
29+
foo:
30+
31+
.weak __foo
32+
__foo:
33+
34+
#--- b.s
35+
.globl b
36+
.weak foo
37+
b:
38+
call a
39+
foo:
40+
41+
.weak __foo
42+
__foo:

0 commit comments

Comments
 (0)