Skip to content

[SymbolGraph] Make symbol-graph output deterministic #80648

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 9, 2025

Conversation

cachemeifyoucan
Copy link
Contributor

@cachemeifyoucan cachemeifyoucan commented Apr 8, 2025

SymbolGraph generation iterating over llvm::DenseSet, which makes the symbols and relationship fields appear in non-deterministic ordering. Switch to use llvm::SetVector to preserve the insertion order from SourceEntityWalker to make order deterministic.

Resolves: #59602

@cachemeifyoucan
Copy link
Contributor Author

@swift-ci please smoke test

Comment on lines 665 to 668
SmallString<256> LHSTargetUSR, RHSTargetUSR;
LHS.Target.getUSR(LHSTargetUSR);
RHS.Target.getUSR(RHSTargetUSR);
return LHSTargetUSR < RHSTargetUSR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is code you didn't touch, but i feel like this should order on more than target USR - there can be several relationships that target the same symbol which are still going to be in a nondeterminstic ordering after this sort is applied.

Personally i would define a < operator for Edge which compares target, source, and kind, and defer to that in this sort method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Is USR good enough for Symbol entry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically no. Imported Objective-C symbols that have an automatically-converted version (e.g. a completion handler being converted into an async method) will appear in symbol graphs as two symbols with the same USR. I'm not sure how to fetch that info offhand, but being able to match those would be useful as well.

In SymbolKit we would just defer to the declaration fragments, but we don't store those outside the output JSON structure, so i'm not sure what could be used as an ordering constraint in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is easier if just convert DenseSet to llvm::SetVector? It uses more memory but assuming the ASTWalker is deterministic, then preserving the order of insertion should make symbol graph deterministic too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if ASTWalker (SourceEntityWalker in our case) is deterministic, but if it is then that should be perfect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A regular ASTWalker should be, unless some function overlayed on top is breaking this, then it is a different problem to fix. I run it in a loop for a while and so far nothing is non-deterministic. Let me update to use that.

@QuietMisdreavus
Copy link
Contributor

Thanks so much for opening this PR! This is something that's been in my awareness for a long time. I've previously filed this issue as #59602 and rdar://95598551, for reference. Can you update the relationships sorting method like i commented?

@cachemeifyoucan
Copy link
Contributor Author

Thanks so much for opening this PR! This is something that's been in my awareness for a long time. I've previously filed this issue as #59602 and rdar://95598551, for reference. Can you update the relationships sorting method like i commented?

Ah, nice. I am not aware of that. Let me update to use your issue and radar instead.

SymbolGraph generation iterating over llvm::DenseSet, which makes the
symbols and relationship fields appear in non-deterministic ordering.
Switch to use llvm::SetVector to preserve the insertion order from
SourceEntityWalker to make order deterministic.

Resolves: swiftlang#59602
@cachemeifyoucan
Copy link
Contributor Author

@swift-ci please smoke test

Copy link
Contributor

@QuietMisdreavus QuietMisdreavus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks so much!

@cachemeifyoucan
Copy link
Contributor Author

@ahoppen Since I am removing your sorting function, does the order of edges matter to you or it just need to be deterministic?

@ahoppen
Copy link
Member

ahoppen commented Apr 9, 2025

The only need for the sorting was that we get consistent output to compare symbol graphs. If we get deterministic output by other means now, the sorting shouldn’t be needed anymore.

@cachemeifyoucan cachemeifyoucan merged commit a3ed3c0 into swiftlang:main Apr 9, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Symbols and relationships in a symbol graph are non-deterministically ordered
3 participants