-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lld-macho] Reduce memory usage of printing thunks in map file #122785
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lld-macho @llvm/pr-subscribers-lld Author: None (alx32) ChangesThis commit improves the memory efficiency of the lld-macho linker by optimizing how thunks are printed in the map file. Previously, merging vectors of input sections required creating a temporary vector, which increased memory usage and in some cases caused the linker to run out of memory. The new approach interleaves the printing of two arrays of ConcatInputSection in sorted order without allocating additional memory for a merged array. Full diff: https://github.com/llvm/llvm-project/pull/122785.diff 2 Files Affected:
diff --git a/lld/MachO/ConcatOutputSection.h b/lld/MachO/ConcatOutputSection.h
index 8131c48d31113e..c2449e82b73ad1 100644
--- a/lld/MachO/ConcatOutputSection.h
+++ b/lld/MachO/ConcatOutputSection.h
@@ -72,7 +72,7 @@ class TextOutputSection : public ConcatOutputSection {
void finalizeContents() override {}
void finalize() override;
bool needsThunks() const;
- ArrayRef<ConcatInputSection *> getThunks() const { return thunks; }
+ const std::vector<ConcatInputSection *> &getThunks() const { return thunks; }
void writeTo(uint8_t *buf) const override;
static bool classof(const OutputSection *sec) {
diff --git a/lld/MachO/MapFile.cpp b/lld/MachO/MapFile.cpp
index 12417df8cecb8c..2aa91af9cc7112 100644
--- a/lld/MachO/MapFile.cpp
+++ b/lld/MachO/MapFile.cpp
@@ -161,20 +161,6 @@ static uint64_t getSymSizeForMap(Defined *sym) {
return sym->size;
}
-// Merges two vectors of input sections in order of their outSecOff values.
-// This approach creates a new (temporary) vector which is not ideal but the
-// ideal approach leads to a lot of code duplication.
-static std::vector<ConcatInputSection *>
-mergeOrderedInputs(ArrayRef<ConcatInputSection *> inputs1,
- ArrayRef<ConcatInputSection *> inputs2) {
- std::vector<ConcatInputSection *> vec(inputs1.size() + inputs2.size());
- std::merge(inputs1.begin(), inputs1.end(), inputs2.begin(), inputs2.end(),
- vec.begin(), [](ConcatInputSection *a, ConcatInputSection *b) {
- return a->outSecOff < b->outSecOff;
- });
- return vec;
-}
-
void macho::writeMapFile() {
if (config->mapFile.empty())
return;
@@ -217,15 +203,42 @@ void macho::writeMapFile() {
seg->name.str().c_str(), osec->name.str().c_str());
}
- // Shared function to print an array of symbols.
- auto printIsecArrSyms = [&](const std::vector<ConcatInputSection *> &arr) {
- for (const ConcatInputSection *isec : arr) {
+ // Shared function to print one or two arrays of ConcatInputSection in
+ // ascending outSecOff order. The second array is optional; if provided, we
+ // interleave the printing in sorted order without allocating a merged temp
+ // array.
+ auto printIsecArrSyms = [&](const std::vector<ConcatInputSection *> &arr1,
+ const std::vector<ConcatInputSection *> *arr2 =
+ nullptr) {
+ // Helper lambda that prints all symbols from one ConcatInputSection.
+ auto printOne = [&](const ConcatInputSection *isec) {
for (Defined *sym : isec->symbols) {
- if (!(isPrivateLabel(sym->getName()) && getSymSizeForMap(sym) == 0))
+ if (!(isPrivateLabel(sym->getName()) && getSymSizeForMap(sym) == 0)) {
os << format("0x%08llX\t0x%08llX\t[%3u] %s\n", sym->getVA(),
getSymSizeForMap(sym),
- readerToFileOrdinal[sym->getFile()],
+ readerToFileOrdinal.lookup(sym->getFile()),
sym->getName().str().data());
+ }
+ }
+ };
+
+ // If there is only one array, print all symbols from it and return.
+ // This simplifies the logic for the merge case below.
+ if (!arr2) {
+ for (const ConcatInputSection *isec : arr1)
+ printOne(isec);
+ return;
+ }
+
+ size_t i = 0, j = 0;
+ size_t size1 = arr1.size();
+ size_t size2 = arr2->size();
+ while (i < size1 || j < size2) {
+ if (i < size1 &&
+ (j >= size2 || arr1[i]->outSecOff <= (*arr2)[j]->outSecOff)) {
+ printOne(arr1[i++]);
+ } else if (j < size2) {
+ printOne((*arr2)[j++]);
}
}
};
@@ -235,9 +248,7 @@ void macho::writeMapFile() {
for (const OutputSegment *seg : outputSegments) {
for (const OutputSection *osec : seg->getSections()) {
if (auto *textOsec = dyn_cast<TextOutputSection>(osec)) {
- auto inputsAndThunks =
- mergeOrderedInputs(textOsec->inputs, textOsec->getThunks());
- printIsecArrSyms(inputsAndThunks);
+ printIsecArrSyms(textOsec->inputs, &textOsec->getThunks());
} else if (auto *concatOsec = dyn_cast<ConcatOutputSection>(osec)) {
printIsecArrSyms(concatOsec->inputs);
} else if (osec == in.cStringSection || osec == in.objcMethnameSection) {
|
I don't see any changes in memory usage with this PR. Were you able to see a difference when running: |
I ran those commands from your previous comment and didn't see memory changes and also got results very different than your previous results - so assumed that it's platform / Linux distro specific and that this feature doesn't work for my setup. This does eliminate the extra temporary array so there is less memory being used for sure. |
These are the CMake arguments I used to build LLVM if it helps: |
Latest commit should address the other (main) reason for high mem usage - the |
lld/MachO/MapFile.cpp
Outdated
size_t size2 = arr2->size(); | ||
while (i < size1 || j < size2) { | ||
if (i < size1 && | ||
(j >= size2 || arr1[i]->outSecOff <= (*arr2)[j]->outSecOff)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the general logic, but (*arr2)[j]
is not so please to look at, why not ArrayRef
? IIRC you can have ArrayRef
pointing to nullptr
s and it would be prettier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, changing it to ArrayRef
simplified the code / logic a lot.
adb4e0d
to
8f4b2cd
Compare
@alx32 Thank you! This is a massive improvement compared even to before your original change:
Before 162814a:
After 162814a:
This PR:
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/11172 Here is the relevant piece of the build log for the reference
|
This commit improves the memory efficiency of the lld-macho linker by optimizing how thunks are printed in the map file. Previously, merging vectors of input sections required creating a temporary vector, which increased memory usage and in some cases caused the linker to run out of memory as reported in comments on #120496. The new approach interleaves the printing of two arrays of ConcatInputSection in sorted order without allocating additional memory for a merged array.