Skip to content

[lld-macho] Category merger: handle addends when getting symbol at offset #91238

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 2, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 26 additions & 3 deletions lld/MachO/ObjC.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -490,6 +490,7 @@ class ObjcCategoryMerger {
Defined *emitCategoryName(const std::string &name, ObjFile *objFile);
void createSymbolReference(Defined *refFrom, const Symbol *refTo,
uint32_t offset, const Reloc &relocTemplate);
Defined *tryFindDefinedOnIsec(const InputSection *isec, uint32_t offset);
Symbol *tryGetSymbolAtIsecOffset(const ConcatInputSection *isec,
uint32_t offset);
Defined *tryGetDefinedAtIsecOffset(const ConcatInputSection *isec,
Expand Down Expand Up @@ -566,7 +567,25 @@ ObjcCategoryMerger::tryGetSymbolAtIsecOffset(const ConcatInputSection *isec,
if (!reloc)
return nullptr;

return reloc->referent.get<Symbol *>();
Symbol *sym = reloc->referent.get<Symbol *>();

if (reloc->addend) {
assert(isa<Defined>(sym) && "Expected defined for non-zero addend");
Defined *definedSym = cast<Defined>(sym);
sym = tryFindDefinedOnIsec(definedSym->isec(),
definedSym->value + reloc->addend);
}

return sym;
}

Defined *ObjcCategoryMerger::tryFindDefinedOnIsec(const InputSection *isec,
uint32_t offset) {
for (Defined *sym : isec->symbols)
if ((sym->value <= offset) && (sym->value + sym->size > offset))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we have an offset in the middle of symbol.

Copy link
Contributor Author

@alx32 alx32 May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not fully debug it but it does happen in very rare cases. I don't think it violates any rule, the reloc can refer to within a symbol, not necessarily to the start of one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to tighten the condition as possible for the expected case or value, instead of loosely accepting a valid case.

Copy link
Contributor Author

@alx32 alx32 May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand. I can't think of a narrower case that would actually still work - we do need to find a symbol by a reloc into the middle of it (or any arbitrary position).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, I'd like to understand why we have an offset in the middle of symbol. If it's unexpected, we should fix the root-cause. If expected, we should document it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will look into the cases where this happens and come back to this change later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the test, the addend comes from:

__CATEGORY__TtC11SimpleClass11SimpleClass_$_SimpleClass:
	[...]
	.quad	_$s11SimpleClassAACMf+24    // <---- Addend = 24

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function here seems similar to this. Perhaps, implementation could be shared or refactored?

Copy link
Contributor Author

@alx32 alx32 Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they might be different enough to remain separate because:

  • In InputFiles it requires the offset to be to the start of the symbol:
// The offset should point at the exact address of a symbol (with no addend.)

Here the offset can be to the inside of the symbol - see here

.quad	__DATA__TtC11SimpleClass11SimpleClass+2

We could remove this requirement and share them - but would have to remove the assert from the version in InputFiles.

  • In InputFiles it relies on the symbols being ordered - which is not necessarily the case inside optimization passes. So here we have to do a linear scan. I don't think in practice this would be a restriction as the stage of this pass I don't think we run anything that reorders the symbols for these sections - but it's not a hard requirement to have them sorted as far as I know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the comment says that but the implementation seems to handle it for the case when it doesn't. Okay with the linear scan, but it would be a nice to make it faster if we know it's already sorted (probably worth checking). If it is sorted, it's worth adding an assert in that case as well to make sure we do do not invalidate correctness behavior.

return sym;

return nullptr;
}

Defined *
Expand Down Expand Up @@ -1288,8 +1307,12 @@ void ObjcCategoryMerger::eraseMergedCategories() {
continue;

eraseISec(catInfo.catBodyIsec);

tryEraseDefinedAtIsecOffset(catInfo.catBodyIsec, catLayout.nameOffset);
// We can't erase 'catLayout.nameOffset' for Swift categories because the
// name will be referenced for generating relative offsets
// See usages of 'l_.str.11.SimpleClass' in objc-category-merging-swift.s
// TODO: handle the above in a smarter way
if (catInfo.sourceLanguage != SourceLanguage::Swift)
tryEraseDefinedAtIsecOffset(catInfo.catBodyIsec, catLayout.nameOffset);
tryEraseDefinedAtIsecOffset(catInfo.catBodyIsec,
catLayout.instanceMethodsOffset);
tryEraseDefinedAtIsecOffset(catInfo.catBodyIsec,
Expand Down
Loading
Loading