Skip to content

[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF #106193

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Sep 8, 2024
1 change: 1 addition & 0 deletions lld/ELF/LTO.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,7 @@ static lto::Config createConfig() {
config->ltoValidateAllVtablesHaveTypeInfos;
c.AllVtablesHaveTypeInfos = ctx.ltoAllVtablesHaveTypeInfos;
c.AlwaysEmitRegularLTOObj = !config->ltoObjPath.empty();
c.KeepSymbolNameCopies = false;

for (const llvm::StringRef &name : config->thinLTOModulesToCompile)
c.ThinLTOModulesToCompile.emplace_back(name);
Expand Down
5 changes: 5 additions & 0 deletions llvm/include/llvm/LTO/Config.h
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,11 @@ struct Config {
/// want to know a priori all possible output files.
bool AlwaysEmitRegularLTOObj = false;

/// If true, the LTO instance creates copies of the symbol names for LTO::run.
/// The lld linker uses string saver to keep symbol names alive and doesn't
/// need to create copies, so it can set this field to false.
bool KeepSymbolNameCopies = true;

/// Allows non-imported definitions to get the potentially more constraining
/// visibility from the prevailing definition. FromPrevailing is the default
/// because it works for many binary formats. ELF can use the more optimized
Expand Down
14 changes: 12 additions & 2 deletions llvm/include/llvm/LTO/LTO.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,9 @@
#ifndef LLVM_LTO_LTO_H
#define LLVM_LTO_LTO_H

#include <memory>

#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/Bitcode/BitcodeReader.h"
Expand All @@ -23,6 +26,7 @@
#include "llvm/Object/IRSymtab.h"
#include "llvm/Support/Caching.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/StringSaver.h"
#include "llvm/Support/thread.h"
#include "llvm/Transforms/IPO/FunctionAttrs.h"
#include "llvm/Transforms/IPO/FunctionImport.h"
Expand Down Expand Up @@ -348,6 +352,11 @@ class LTO {
DenseMap<GlobalValue::GUID, StringRef> PrevailingModuleForGUID;
} ThinLTO;

std::unique_ptr<llvm::BumpPtrAllocator> Alloc;

// Symbol saver for global resolution map.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to group these closer to GlobalResolutions map? The comment helps refer to it, but right now is separated by the "struct GlobalResolution" definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

std::unique_ptr<llvm::StringSaver> GlobalResolutionSymbolSaver;

// The global resolution for a particular (mangled) symbol name. This is in
// particular necessary to track whether each symbol can be internalized.
// Because any input file may introduce a new cross-partition reference, we
Expand Down Expand Up @@ -406,9 +415,10 @@ class LTO {
};

// Global mapping from mangled symbol names to resolutions.
// Make this an optional to guard against accessing after it has been reset
// Make this an unique_ptr to guard against accessing after it has been reset
// (to reduce memory after we're done with it).
std::optional<StringMap<GlobalResolution>> GlobalResolutions;
std::unique_ptr<llvm::DenseMap<StringRef, GlobalResolution>>
GlobalResolutions;

void addModuleToGlobalRes(ArrayRef<InputFile::Symbol> Syms,
ArrayRef<SymbolResolution> Res, unsigned Partition,
Expand Down
24 changes: 21 additions & 3 deletions llvm/lib/LTO/LTO.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,10 @@ cl::opt<bool> EnableLTOInternalization(
"enable-lto-internalization", cl::init(true), cl::Hidden,
cl::desc("Enable global value internalization in LTO"));

static cl::opt<bool>
LTOKeepSymbolCopies("lto-keep-symbol-copies", cl::init(false), cl::Hidden,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is an option needed? I think this can just be set through the config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a back-up option in case this change hit any corner case (missed a symbol) but not caught in release. Though all symbols used by GlobalResolutions should be covered by looking at the code.

cl::desc("Keep copies of symbols in LTO indexing"));

/// Indicate we are linking with an allocator that supports hot/cold operator
/// new interfaces.
extern cl::opt<bool> SupportsHotColdNew;
Expand Down Expand Up @@ -607,8 +611,14 @@ LTO::LTO(Config Conf, ThinBackend Backend,
: Conf(std::move(Conf)),
RegularLTO(ParallelCodeGenParallelismLevel, this->Conf),
ThinLTO(std::move(Backend)),
GlobalResolutions(std::make_optional<StringMap<GlobalResolution>>()),
LTOMode(LTOMode) {}
GlobalResolutions(
std::make_unique<DenseMap<StringRef, GlobalResolution>>()),
LTOMode(LTOMode) {
if (Conf.KeepSymbolNameCopies || LTOKeepSymbolCopies) {
Alloc = std::make_unique<BumpPtrAllocator>();
GlobalResolutionSymbolSaver = std::make_unique<llvm::StringSaver>(*Alloc);
}
}

// Requires a destructor for MapVector<BitcodeModule>.
LTO::~LTO() = default;
Expand All @@ -626,7 +636,12 @@ void LTO::addModuleToGlobalRes(ArrayRef<InputFile::Symbol> Syms,
assert(ResI != ResE);
SymbolResolution Res = *ResI++;

auto &GlobalRes = (*GlobalResolutions)[Sym.getName()];
StringRef SymbolName = Sym.getName();
// Keep copies of symbols if the client of LTO says so.
if (GlobalResolutionSymbolSaver && !GlobalResolutions->contains(SymbolName))
SymbolName = GlobalResolutionSymbolSaver->save(SymbolName);

auto &GlobalRes = (*GlobalResolutions)[SymbolName];
Comment on lines +619 to +624
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A somewhat nit picky suggestion, but should we create a wrapper class for llvm::DenseMap<StringRef, GlobalResolution>? This way, we can intercept its operator()[] and put all the deduplication logic there.

This way, the application logic should stay clear. That is:

auto &GlobalRes = (*GlobalResolutions)[Sym.getName()];

staying as is.

We should be able to put members like:

  std::unique_ptr<llvm::BumpPtrAllocator> Alloc;

  // Symbol saver for global resolution map.
  std::unique_ptr<llvm::StringSaver> GlobalResolutionSymbolSaver;

into the wrapper class and probably strip away std::unique_ptr from them, which in turn should let you remove:

  GlobalResolutionSymbolSaver.reset();
  Alloc.reset();

below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a helper method LTO::releaseGlobalResolutionsMemory which releases the memory used by GlobalResolutions together. I'm not sure if having a wrapper class has the correct ROI for code simplicity here as reader also need to parse the added logic of wrapper class.

GlobalRes.UnnamedAddr &= Sym.isUnnamedAddr();
if (Res.Prevailing) {
assert(!GlobalRes.Prevailing &&
Expand Down Expand Up @@ -1797,6 +1812,9 @@ Error LTO::runThinLTO(AddStreamFn AddStream, FileCache Cache,
// cross module importing, which adds to peak memory via the computed import
// and export lists.
GlobalResolutions.reset();
// Release the string saver memory.
GlobalResolutionSymbolSaver.reset();
Alloc.reset();

if (Conf.OptLevel > 0)
ComputeCrossModuleImport(ThinLTO.CombinedIndex, ModuleToDefinedGVSummaries,
Expand Down
Loading