Commit 334a576

[llvm-objcopy] Add support of symbol modification flags for MachO (#120895)
This patch adds support of the following llvm-objcopy flags for MachO:

- `--globalize-symbol`, `--globalize-symbols`,
- `--keep-global-symbol`, `-G`, `--keep-global-symbols`,
- `--localize-symbol`, `-L`, `--localize-symbols`,
- `--skip-symbol`, `--skip-symbols`.

Code in `updateAndRemoveSymbols` for MachO is kept similar to its version for ELF.

Fixes #120894
1 parent 8e1cb96 commit 334a576

10 files changed, 659 additions and 62 deletions

llvm/docs/CommandGuide/llvm-objcopy.rst

Lines changed: 50 additions & 50 deletions
@@ -78,10 +78,47 @@ multiple file formats.
  Enable deterministic mode when copying archives, i.e. use 0 for archive member
  header UIDs, GIDs and timestamp fields. On by default.

+.. option:: --globalize-symbol <symbol>
+
+ Mark any defined symbols named ``<symbol>`` as global symbols in the output.
+ Can be specified multiple times to mark multiple symbols.
+
+.. option:: --globalize-symbols <filename>
+
+ Read a list of names from the file ``<filename>`` and mark defined symbols with
+ those names as global in the output. In the file, each line represents a single
+ symbol, with leading and trailing whitespace ignored, as is anything following
+ a '#'. Can be specified multiple times to read names from multiple files.
+
 .. option:: --help, -h

  Print a summary of command line options.

+.. option:: --keep-global-symbol <symbol>, -G
+
+ Mark all symbols local in the output, except for symbols with the name
+ ``<symbol>``. Can be specified multiple times to ignore multiple symbols.
+
+.. option:: --keep-global-symbols <filename>
+
+ Mark all symbols local in the output, except for symbols named in the file
+ ``<filename>``. In the file, each line represents a single symbol, with leading
+ and trailing whitespace ignored, as is anything following a '#'. Can be
+ specified multiple times to read names from multiple files.
+
+.. option:: --localize-symbol <symbol>, -L
+
+ Mark any defined non-common symbol named ``<symbol>`` as a local symbol in the
+ output. Can be specified multiple times to mark multiple symbols as local.
+
+.. option:: --localize-symbols <filename>
+
+ Read a list of names from the file ``<filename>`` and mark defined non-common
+ symbols with those names as local in the output. In the file, each line
+ represents a single symbol, with leading and trailing whitespace ignored, as is
+ anything following a '#'. Can be specified multiple times to read names from
+ multiple files.
+
 .. option:: --only-keep-debug

  Produce a debug file as the output that only preserves contents of sections
@@ -177,6 +214,19 @@ multiple file formats.
    flags.
  - `share` = add the `IMAGE_SCN_MEM_SHARED` and `IMAGE_SCN_MEM_READ` flags.

+.. option:: --skip-symbol <symbol>
+
+ Do not change the parameters of symbol ``<symbol>`` when executing other
+ options that can change the symbol's name, binding or visibility.
+
+.. option:: --skip-symbols <filename>
+
+ Do not change the parameters of symbols named in the file ``<filename>`` when
+ executing other options that can change the symbol's name, binding or
+ visibility. In the file, each line represents a single symbol, with leading
+ and trailing whitespace ignored, as is anything following a '#'.
+ Can be specified multiple times to read names from multiple files.
+
 .. option:: --strip-all-gnu

  Remove all symbols, debug sections and relocations from the output. This option
@@ -355,18 +405,6 @@ them.
  For binary outputs, fill the gaps between sections with ``<value>`` instead
  of zero. The value must be an unsigned 8-bit integer.

-.. option:: --globalize-symbol <symbol>
-
- Mark any defined symbols named ``<symbol>`` as global symbols in the output.
- Can be specified multiple times to mark multiple symbols.
-
-.. option:: --globalize-symbols <filename>
-
- Read a list of names from the file ``<filename>`` and mark defined symbols with
- those names as global in the output. In the file, each line represents a single
- symbol, with leading and trailing whitespace ignored, as is anything following
- a '#'. Can be specified multiple times to read names from multiple files.
-
 .. option:: --input-target <format>, -I

  Read the input as the specified format. See `SUPPORTED FORMATS`_ for a list of
@@ -377,18 +415,6 @@ them.

  Keep symbols of type `STT_FILE`, even if they would otherwise be stripped.

-.. option:: --keep-global-symbol <symbol>, -G
-
- Mark all symbols local in the output, except for symbols with the name
- ``<symbol>``. Can be specified multiple times to ignore multiple symbols.
-
-.. option:: --keep-global-symbols <filename>
-
- Mark all symbols local in the output, except for symbols named in the file
- ``<filename>``. In the file, each line represents a single symbol, with leading
- and trailing whitespace ignored, as is anything following a '#'. Can be
- specified multiple times to read names from multiple files.
-
 .. option:: --keep-section <section>

  When removing sections from the output, do not remove sections named
@@ -410,19 +436,6 @@ them.

  Mark all symbols with hidden or internal visibility local in the output.

-.. option:: --localize-symbol <symbol>, -L
-
- Mark any defined non-common symbol named ``<symbol>`` as a local symbol in the
- output. Can be specified multiple times to mark multiple symbols as local.
-
-.. option:: --localize-symbols <filename>
-
- Read a list of names from the file ``<filename>`` and mark defined non-common
- symbols with those names as local in the output. In the file, each line
- represents a single symbol, with leading and trailing whitespace ignored, as is
- anything following a '#'. Can be specified multiple times to read names from
- multiple files.
-
 .. option:: --new-symbol-visibility <visibility>

  Specify the visibility of the symbols automatically created when using binary
@@ -489,19 +502,6 @@ them.
  Read a list of symbols from <filename> and change their visibility to the
  specified value. Visibility values: default, internal, hidden, protected.

-.. option:: --skip-symbol <symbol>
-
- Do not change the parameters of symbol ``<symbol>`` when executing other
- options that can change the symbol's name, binding or visibility.
-
-.. option:: --skip-symbols <filename>
-
- Do not change the parameters of symbols named in the file ``<filename>`` when
- executing other options that can change the symbol's name, binding or
- visibility. In the file, each line represents a single symbol, with leading
- and trailing whitespace ignored, as is anything following a '#'.
- Can be specified multiple times to read names from multiple files.
-
 .. option:: --split-dwo <dwo-file>

  Equivalent to running :program:`llvm-objcopy` with :option:`--extract-dwo` and
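
The documentation above describes the symbol list file format used by `--globalize-symbols`, `--keep-global-symbols`, `--localize-symbols` and `--skip-symbols`: one symbol per line, leading and trailing whitespace ignored, and anything after a '#' ignored. A minimal sketch of that rule follows. The function `parseSymbolList` is a hypothetical helper written for illustration and is not the parser llvm-objcopy uses; skipping lines that end up empty is an assumption.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not LLVM code): turns the text of a symbol list file
// into a list of names, following the documented per-line rules.
std::vector<std::string> parseSymbolList(const std::string &Text) {
  std::vector<std::string> Names;
  std::istringstream In(Text);
  std::string Line;
  while (std::getline(In, Line)) {
    // Ignore everything from the first '#' to the end of the line.
    Line = Line.substr(0, Line.find('#'));
    // Ignore leading and trailing whitespace.
    const char *WS = " \t\r\n\v\f";
    std::size_t Begin = Line.find_first_not_of(WS);
    if (Begin == std::string::npos)
      continue; // Assumption: blank and comment-only lines name no symbol.
    std::size_t End = Line.find_last_not_of(WS);
    Names.push_back(Line.substr(Begin, End - Begin + 1));
  }
  return Names;
}
```

With this sketch, a file containing `  _foo  `, a comment-only line and `_bar # note` would yield the two names `_foo` and `_bar`.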

llvm/docs/ReleaseNotes.md

Lines changed: 6 additions & 0 deletions
@@ -350,6 +350,12 @@ Changes to the Debug Info
 Changes to the LLVM tools
 ---------------------------------

+* llvm-objcopy now supports the following options for Mach-O:
+  `--globalize-symbol`, `--globalize-symbols`,
+  `--keep-global-symbol`, `--keep-global-symbols`,
+  `--localize-symbol`, `--localize-symbols`,
+  `--skip-symbol`, `--skip-symbols`.
+
 Changes to LLDB
 ---------------------------------

llvm/lib/ObjCopy/ConfigManager.cpp

Lines changed: 2 additions & 4 deletions
@@ -36,11 +36,9 @@ Expected<const COFFConfig &> ConfigManager::getCOFFConfig() const {

 Expected<const MachOConfig &> ConfigManager::getMachOConfig() const {
   if (!Common.SplitDWO.empty() || !Common.SymbolsPrefix.empty() ||
-      !Common.SymbolsPrefixRemove.empty() || !Common.SymbolsToSkip.empty() ||
+      !Common.SymbolsPrefixRemove.empty() ||
       !Common.AllocSectionsPrefix.empty() || !Common.KeepSection.empty() ||
-      !Common.SymbolsToGlobalize.empty() || !Common.SymbolsToKeep.empty() ||
-      !Common.SymbolsToLocalize.empty() ||
-      !Common.SymbolsToKeepGlobal.empty() || !Common.SectionsToRename.empty() ||
+      !Common.SymbolsToKeep.empty() || !Common.SectionsToRename.empty() ||
       !Common.UnneededSymbolsToRemove.empty() ||
       !Common.SetSectionAlignment.empty() || !Common.SetSectionFlags.empty() ||
       !Common.SetSectionType.empty() || Common.ExtractDWO ||

llvm/lib/ObjCopy/MachO/MachOObjcopy.cpp

Lines changed: 27 additions & 8 deletions
@@ -93,19 +93,38 @@ static void markSymbols(const CommonConfig &, Object &Obj) {
 static void updateAndRemoveSymbols(const CommonConfig &Config,
                                    const MachOConfig &MachOConfig,
                                    Object &Obj) {
-  for (SymbolEntry &Sym : Obj.SymTable) {
-    // Weaken symbols first to match ELFObjcopy behavior.
-    bool IsExportedAndDefined =
-        (Sym.n_type & llvm::MachO::N_EXT) &&
-        (Sym.n_type & llvm::MachO::N_TYPE) != llvm::MachO::N_UNDF;
-    if (IsExportedAndDefined &&
+  Obj.SymTable.updateSymbols([&](SymbolEntry &Sym) {
+    if (Config.SymbolsToSkip.matches(Sym.Name))
+      return;
+
+    if (!Sym.isUndefinedSymbol() && Config.SymbolsToLocalize.matches(Sym.Name))
+      Sym.n_type &= ~MachO::N_EXT;
+
+    // Note: these two globalize flags have very similar names but different
+    // meanings:
+    //
+    // --globalize-symbol: promote a symbol to global
+    // --keep-global-symbol: all symbols except for these should be made local
+    //
+    // If --globalize-symbol is specified for a given symbol, it will be
+    // global in the output file even if it is not included via
+    // --keep-global-symbol. Because of that, make sure to check
+    // --globalize-symbol second.
+    if (!Sym.isUndefinedSymbol() && !Config.SymbolsToKeepGlobal.empty() &&
+        !Config.SymbolsToKeepGlobal.matches(Sym.Name))
+      Sym.n_type &= ~MachO::N_EXT;
+
+    if (!Sym.isUndefinedSymbol() && Config.SymbolsToGlobalize.matches(Sym.Name))
+      Sym.n_type |= MachO::N_EXT;
+
+    if (Sym.isExternalSymbol() && !Sym.isUndefinedSymbol() &&
         (Config.Weaken || Config.SymbolsToWeaken.matches(Sym.Name)))
-      Sym.n_desc |= llvm::MachO::N_WEAK_DEF;
+      Sym.n_desc |= MachO::N_WEAK_DEF;

     auto I = Config.SymbolsToRename.find(Sym.Name);
     if (I != Config.SymbolsToRename.end())
       Sym.Name = std::string(I->getValue());
-  }
+  });

   auto RemovePred = [&Config, &MachOConfig,
                      &Obj](const std::unique_ptr<SymbolEntry> &N) {
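
The order of checks in the hunk above (skip, then localize, then keep-global, then globalize) decides which flag wins when several name the same symbol. The following stand-alone sketch models only that ordering on the `N_EXT` bit. `ToySym`, `BindingFlags` and `applyBindingFlags` are hypothetical names for illustration. Exact-name lookup in a `std::set` stands in for the patch's `matches()` calls, and the undefined-symbol test is assumed to be `(n_type & N_TYPE) == N_UNDF`.

```cpp
#include <cstdint>
#include <set>
#include <string>

// Mach-O nlist n_type bits (values from the Mach-O format).
constexpr std::uint8_t N_EXT = 0x01;  // external (global) bit
constexpr std::uint8_t N_TYPE = 0x0e; // mask for the type bits
constexpr std::uint8_t N_UNDF = 0x00; // undefined symbol type

// Hypothetical stand-ins for SymbolEntry and the CommonConfig name matchers.
struct ToySym {
  std::string Name;
  std::uint8_t n_type;
};
struct BindingFlags {
  std::set<std::string> Skip, Localize, KeepGlobal, Globalize;
};

// Assumed definition of "undefined": the type bits equal N_UNDF.
bool isUndefined(const ToySym &S) { return (S.n_type & N_TYPE) == N_UNDF; }

// Applies the four flags in the same order as the patch.
void applyBindingFlags(ToySym &S, const BindingFlags &F) {
  if (F.Skip.count(S.Name))
    return; // --skip-symbol: leave the symbol untouched.

  if (!isUndefined(S) && F.Localize.count(S.Name))
    S.n_type &= static_cast<std::uint8_t>(~N_EXT); // --localize-symbol

  // --keep-global-symbol: if any names were given, every other defined
  // symbol becomes local.
  if (!isUndefined(S) && !F.KeepGlobal.empty() && !F.KeepGlobal.count(S.Name))
    S.n_type &= static_cast<std::uint8_t>(~N_EXT);

  // --globalize-symbol is checked last, so it overrides the two steps above.
  if (!isUndefined(S) && F.Globalize.count(S.Name))
    S.n_type |= N_EXT;
}
```

In this model, a defined symbol listed under `Globalize` but absent from a non-empty `KeepGlobal` set still ends up with `N_EXT` set, which is the behaviour the comment in the patch describes.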

llvm/lib/ObjCopy/MachO/MachOObject.cpp

Lines changed: 13 additions & 0 deletions
@@ -33,6 +33,19 @@ SymbolEntry *SymbolTable::getSymbolByIndex(uint32_t Index) {
       static_cast<const SymbolTable *>(this)->getSymbolByIndex(Index));
 }

+void SymbolTable::updateSymbols(function_ref<void(SymbolEntry &)> Callable) {
+  for (auto &Sym : Symbols)
+    Callable(*Sym);
+
+  // Partition symbols: local < defined external < undefined external.
+  auto ExternalBegin = std::stable_partition(
+      std::begin(Symbols), std::end(Symbols),
+      [](const auto &Sym) { return Sym->isLocalSymbol(); });
+  std::stable_partition(ExternalBegin, std::end(Symbols), [](const auto &Sym) {
+    return !Sym->isUndefinedSymbol();
+  });
+}
+
 void SymbolTable::removeSymbols(
     function_ref<bool(const std::unique_ptr<SymbolEntry> &)> ToRemove) {
   llvm::erase_if(Symbols, ToRemove);
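
The two `std::stable_partition` calls in `updateSymbols` restore the grouping "local < defined external < undefined external" after the callback may have flipped `N_EXT` on some symbols. The sketch below reproduces that two-pass ordering on a plain vector. `Entry` and `orderSymbols` are hypothetical names, and the predicates are assumptions modelled on the names used in the patch: "local" means neither `N_EXT` nor `N_PEXT` is set, "undefined" means the type bits equal `N_UNDF`.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for SymbolEntry, reduced to the fields needed here.
struct Entry {
  std::string Name;
  std::uint8_t n_type;

  // Assumed predicates: N_EXT = 0x01, N_PEXT = 0x10, N_TYPE mask = 0x0e.
  bool isLocal() const { return (n_type & (0x01 | 0x10)) == 0; }
  bool isUndefined() const { return (n_type & 0x0e) == 0x00; }
};

// Reorders symbols as: local, then defined external, then undefined external.
// stable_partition keeps the relative order inside each group.
void orderSymbols(std::vector<Entry> &Syms) {
  // Pass 1: move all local symbols to the front.
  auto ExternalBegin = std::stable_partition(
      Syms.begin(), Syms.end(), [](const Entry &S) { return S.isLocal(); });
  // Pass 2: within the non-local tail, put defined symbols before undefined.
  std::stable_partition(ExternalBegin, Syms.end(),
                        [](const Entry &S) { return !S.isUndefined(); });
}
```

Using a stable partition rather than a full sort means symbols that stay in the same group keep their original relative order, so an object whose bindings were not changed comes out with the same symbol order it went in with.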

llvm/lib/ObjCopy/MachO/MachOObject.h

Lines changed: 1 addition & 0 deletions
@@ -142,6 +142,7 @@ struct SymbolTable {

   const SymbolEntry *getSymbolByIndex(uint32_t Index) const;
   SymbolEntry *getSymbolByIndex(uint32_t Index);
+  void updateSymbols(function_ref<void(SymbolEntry &)> Callable);
   void removeSymbols(
       function_ref<bool(const std::unique_ptr<SymbolEntry> &)> ToRemove);
 };
Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
+# RUN: yaml2obj %s -o %t
+# RUN: llvm-objcopy --wildcard --globalize-symbol="*" %t %t.copy
+# RUN: llvm-readobj --symbols %t.copy | FileCheck %s
+
+# RUN: echo "*" > %t-star.txt
+# RUN: llvm-objcopy --wildcard --globalize-symbols="%t-star.txt" %t %t.copy
+# RUN: llvm-readobj --symbols %t.copy | FileCheck %s
+
+# CHECK: Symbols [
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: _PrivateSymbol
+# CHECK-NEXT: Extern
+# CHECK-NEXT: Type: Section (0xE)
+# CHECK-NEXT: Section: __text (0x1)
+# CHECK-NEXT: RefType: UndefinedNonLazy (0x0)
+# CHECK-NEXT: Flags [ (0x0)
+# CHECK-NEXT: ]
+# CHECK-NEXT: Value: 0x1
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: _PrivateExternalSymbol
+# CHECK-NEXT: PrivateExtern
+# CHECK-NEXT: Extern
+# CHECK-NEXT: Type: Section (0xE)
+# CHECK-NEXT: Section: __text (0x1)
+# CHECK-NEXT: RefType: UndefinedNonLazy (0x0)
+# CHECK-NEXT: Flags [ (0x0)
+# CHECK-NEXT: ]
+# CHECK-NEXT: Value: 0x2
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: _CommonSymbol
+# CHECK-NEXT: Extern
+# CHECK-NEXT: Type: Section (0xE)
+# CHECK-NEXT: Section: __text (0x1)
+# CHECK-NEXT: RefType: UndefinedNonLazy (0x0)
+# CHECK-NEXT: Flags [ (0x0)
+# CHECK-NEXT: ]
+# CHECK-NEXT: Value: 0x3
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: _UndefinedExternalSymbol
+# CHECK-NEXT: Extern
+# CHECK-NEXT: Type: Undef (0x0)
+# CHECK-NEXT: Section: (0x0)
+# CHECK-NEXT: RefType: UndefinedNonLazy (0x0)
+# CHECK-NEXT: Flags [ (0x0)
+# CHECK-NEXT: ]
+# CHECK-NEXT: Value: 0x0
+# CHECK-NEXT: }
+# CHECK-NEXT: ]
+
+--- !mach-o
+FileHeader:
+  magic: 0xFEEDFACF
+  cputype: 0x100000C
+  cpusubtype: 0x0
+  filetype: 0x2
+  ncmds: 3
+  sizeofcmds: 328
+  flags: 0x200085
+  reserved: 0x0
+LoadCommands:
+  - cmd: LC_SEGMENT_64
+    cmdsize: 152
+    segname: __TEXT
+    vmaddr: 4294967296
+    vmsize: 4096
+    fileoff: 0
+    filesize: 4096
+    maxprot: 5
+    initprot: 5
+    nsects: 1
+    flags: 0
+    Sections:
+      - sectname: __text
+        segname: __TEXT
+        addr: 0x100000FF8
+        size: 8
+        offset: 0xFF8
+        align: 2
+        reloff: 0x0
+        nreloc: 0
+        flags: 0x80000400
+        reserved1: 0x0
+        reserved2: 0x0
+        reserved3: 0x0
+        content: 00008052C0035FD6
+  - cmd: LC_SEGMENT_64
+    cmdsize: 72
+    segname: __LINKEDIT
+    vmaddr: 4294971392
+    vmsize: 4096
+    fileoff: 4096
+    filesize: 67
+    maxprot: 1
+    initprot: 1
+    nsects: 0
+    flags: 0
+  - cmd: LC_SYMTAB
+    cmdsize: 24
+    symoff: 4096
+    nsyms: 4
+    stroff: 4164
+    strsize: 79
+LinkEditData:
+  NameList:
+    - n_strx: 2
+      n_type: 0x0E
+      n_sect: 1
+      n_desc: 0
+      n_value: 1
+    - n_strx: 17
+      n_type: 0x1E
+      n_sect: 1
+      n_desc: 0
+      n_value: 2
+    - n_strx: 40
+      n_type: 0x0F
+      n_sect: 1
+      n_desc: 0
+      n_value: 3
+    - n_strx: 54
+      n_type: 0x01
+      n_sect: 0
+      n_desc: 0
+      n_value: 0
+  StringTable:
+    - ' '
+    - _PrivateSymbol
+    - _PrivateExternalSymbol
+    - _CommonSymbol
+    - _UndefinedExternalSymbol
+...
