Skip to content

[lld-macho] Implement symbol string deduplication #123874

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 23, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions lld/MachO/Config.h
Original file line number Diff line number Diff line change
Expand Up @@ -143,6 +143,7 @@ struct Configuration {
bool timeTraceEnabled = false;
bool dataConst = false;
bool dedupStrings = true;
bool dedupSymbolStrings = true;
bool deadStripDuplicates = false;
bool omitDebugInfo = false;
bool warnDylibInstallName = false;
Expand Down
1 change: 1 addition & 0 deletions lld/MachO/Driver.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1806,6 +1806,7 @@ bool link(ArrayRef<const char *> argsArr, llvm::raw_ostream &stdoutOS,
config->keepICFStabs = args.hasArg(OPT_keep_icf_stabs);
config->dedupStrings =
args.hasFlag(OPT_deduplicate_strings, OPT_no_deduplicate_strings, true);
config->dedupSymbolStrings = !args.hasArg(OPT_no_deduplicate_symbol_strings);
config->deadStripDuplicates = args.hasArg(OPT_dead_strip_duplicates);
config->warnDylibInstallName = args.hasFlag(
OPT_warn_dylib_install_name, OPT_no_warn_dylib_install_name, false);
Expand Down
5 changes: 5 additions & 0 deletions lld/MachO/Options.td
Original file line number Diff line number Diff line change
Expand Up @@ -1476,3 +1476,8 @@ def no_warn_duplicate_libraries : Flag<["-"], "no_warn_duplicate_libraries">,
HelpText<"Do not warn if the input contains duplicate library options.">,
Flags<[HelpHidden]>,
Group<grp_ignored_silently>;

// Add this with the other flags in the rare options group
def no_deduplicate_symbol_strings : Flag<["-"], "no-deduplicate-symbol-strings">,
HelpText<"Do not deduplicate strings in the symbol string table. Might result in larger binaries but slightly faster link times.">,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to also note that "no_dedup..." takes higher priority if both flags are present?

Copy link
Contributor Author

@alx32 alx32 Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which flags are you referring to ? This is enabled by default with only a flag to disable it - there is no flag to enable. Similar to -no_deduplicate in apple linker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh NVM, I thought OPT_deduplicate_strings needs to be explicitly turned on.

Group<grp_rare>;
9 changes: 8 additions & 1 deletion lld/MachO/SyntheticSections.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1541,7 +1541,14 @@ StringTableSection::StringTableSection()

uint32_t StringTableSection::addString(StringRef str) {
uint32_t strx = size;
strings.push_back(str); // TODO: consider deduplicating strings
if (config->dedupSymbolStrings) {
llvm::CachedHashStringRef hashedStr(str);
auto [it, inserted] = stringMap.try_emplace(hashedStr, strx);
if (!inserted)
return it->second;
}

strings.push_back(str);
size += str.size() + 1; // account for null terminator
return strx;
}
Expand Down
1 change: 1 addition & 0 deletions lld/MachO/SyntheticSections.h
Original file line number Diff line number Diff line change
Expand Up @@ -447,6 +447,7 @@ class StringTableSection final : public LinkEditSection {
// match its behavior here since some tools depend on it.
// Consequently, the empty string will be at index 1, not zero.
std::vector<StringRef> strings{" "};
llvm::DenseMap<llvm::CachedHashStringRef, uint32_t> stringMap;
size_t size = 2;
};

Expand Down
11 changes: 11 additions & 0 deletions lld/test/MachO/cfstring-dedup.s
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,17 @@
# RUN: %lld -dylib -framework CoreFoundation %t/foo1.o %t/foo2.o -o %t/foo
# RUN: llvm-objdump --no-print-imm-hex --macho --rebase --bind --syms -d %t/foo | FileCheck %s --check-prefix=LITERALS

# Check that string deduplication for symbol names is working
# RUN: %lld -dylib -framework CoreFoundation %t/foo1.o %t/foo2.o -o %t/foo_no_dedup -no-deduplicate-symbol-strings
# RUN: llvm-strings %t/foo | FileCheck %s --check-prefix=CHECK-DEDUP
# RUN: llvm-strings %t/foo_no_dedup | FileCheck %s --check-prefix=CHECK-NO-DEDUP
# CHECK-DEDUP: _named_cfstring
# CHECK-DEDUP-NOT: _named_cfstring
# CHECK-NO-DEDUP: _named_cfstring
# CHECK-NO-DEDUP: _named_cfstring
# CHECK-NO-DEDUP-NOT: _named_cfstring


# CHECK: (__TEXT,__text) section
# CHECK-NEXT: _foo1:
# CHECK-NEXT: _foo2:
Expand Down
Loading