-
Notifications
You must be signed in to change notification settings - Fork 10.5k
GSB: New algorithm for identifying redundant non-same-type requirements #36109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
slavapestov
merged 2 commits into
swiftlang:main
from
slavapestov:scc-redundant-requirements-algorithm
Mar 2, 2021
Merged
GSB: New algorithm for identifying redundant non-same-type requirements #36109
slavapestov
merged 2 commits into
swiftlang:main
from
slavapestov:scc-redundant-requirements-algorithm
Mar 2, 2021
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
5809f69
to
0d90f50
Compare
Thank you to @ornata for suggesting the idea of using the strongly-connected-components DAG to compute the set of minimized requirements. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Generic signature minimization needs to diagnose and remove any redundant requirements, that is, requirements that can be proven from some subset of the remaining requirements. For each requirement on an equivalence class, we record a set of RequirementSources; a RequirementSource is a "proof" that the equivalence class satisfies the requirement. A RequirementSource is either "explicit" -- meaning it corresponds to a generic requirement written by the user -- or it is "derived", meaning it can be proven from some other explicit requirement. The most naive formulation of the minimization problem is that we say that an explicit requirement is redundant if there is a derived source for the same requirement. However, this is not sufficient, for example: protocol P { associatedtype A : P where A.A == Self } In the signature <T where T : P>, the explicit requirement T : P also has a derived source T.A.A : P. However, this source is "self-derived", meaning that in order to obtain the witness table for T.A.A : P, we first have to obtain the witness table for T.A. The GenericSignatureBuilder handled this kind of 'self-derived' requirement correctly, by removing it from the list of sources. This was implemented in the removeSelfDerived() function. After removeSelfDerived() was called, any remaining derived requirement sources were assumed to obsolete any explicit source for the same requirement. However, even this was not sufficient -- namely, it only handled the case where a explicit requirement would imply a derived source for itself, and not a cycle involving multiple explicit sources that would imply each other. For example, the following generic signature would be misdiagnosed with *both* conformance requirements as redundant, resulting in an invalid generic signature: protocol P { associatedtype T : P } func f<T : P, U : P>(_: T, _: U) where T.X == U, U.X == T {} In the above example, T : P has an explicit requirement source, as well as a derived source (U : P)(U.X : P). Similarly, U : P has an explicit requirement source, as well as a derived source (T : P)(T.X : P). Since neither of the derived sources were "self-derived" according to our definition, we would diagnose *both* explicit sources as redundant. But of course, after dropping them, we are left with the following generic signature: func f<T, U>(_: T, _: U) where T.X == U, U.X == T {} This is no longer valid -- since neither T nor U have a conformance requirement, the nested types T.X and U.X referenced from our same-type requirements are no longer valid. The new algorithm abandons the "self-derived" concept. Instead, we build a directed graph where the vertices are explicit requirements, and the edges are implications where one explicit requirement implies another. In the above example, each of the explicit conformance requirements implies the other. This means a correct minimization must pick exactly one of the two -- not zero, and not both. The set of minimized requirements is formally defined as the minimum set of requirements whose transitive closure is the entire graph. We compute this set by first building the graph of strongly connected components using Tarjan's algorithm. The graph of SCCs is a directed acyclic graph, which means we can compute the root set of the DAG. Finally, we pick a suitable representative requirement from each SCC in the root set, using the lexshort order on subject types. This commit implements the new algorithm, runs it on each generic signature and asserts some properties about the results, but doesn't actually use the algorithm for computing the minimized signature or diagnosing redundancies; that will come in the next commit.
ca473dc
to
f30af84
Compare
Source compatibility already passed and the updated PR only adds some tests. |
f30af84
to
0be55c1
Compare
@swift-ci Please smoke test |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Generic signature minimization needs to diagnose and remove any redundant requirements, that is, requirements that can be proven from some subset of the remaining requirements.
For each requirement on an equivalence class, we record a set of RequirementSources; a RequirementSource is a "proof" that the equivalence class satisfies the requirement.
A RequirementSource is either "explicit" -- meaning it corresponds to a generic requirement written by the user -- or it is "derived", meaning it can be proven from some other explicit requirement.
The most naive formulation of the minimization problem is that we say that an explicit requirement is redundant if there is a derived source for the same requirement. However, this is not sufficient, for example:
In the signature , the explicit requirement T : P also has a derived source T.A.A : P. However, this source is "self-derived", meaning that in order to obtain the witness table for T.A.A : P, we first have to obtain the witness table for T.A.
The GenericSignatureBuilder handled this kind of 'self-derived' requirement correctly, by removing it from the list of sources. This was implemented in the removeSelfDerived() function.
After removeSelfDerived() was called, any remaining derived requirement sources were assumed to obsolete any explicit source for the same requirement.
However, even this was not sufficient -- namely, it only handled the case where a explicit requirement would imply a derived source for itself, and not a cycle involving multiple explicit sources that would imply each other.
For example, the following generic signature would be misdiagnosed with both conformance requirements as redundant, resulting in an invalid generic signature:
In the above example, T : P has an explicit requirement source, as well as a derived source (U : P)(U.X : P). Similarly, U : P has an explicit requirement source, as well as a derived source (T : P)(T.X : P). Since neither of the derived sources were "self-derived" according to our definition, we would diagnose both explicit sources as redundant. But of course, after dropping them, we are left with the following generic signature:
This is no longer valid -- since neither T nor U have a conformance requirement, the nested types T.X and U.X referenced from our same-type requirements are no longer valid.
The new algorithm abandons the "self-derived" concept. Instead, we build a directed graph where the vertices are explicit requirements, and the edges are implications where one explicit requirement implies another. In the above example, each of the explicit conformance requirements implies the other. This means a correct minimization must pick exactly one of the two -- not zero, and not both.
The set of minimized requirements is formally defined as the minimum set of requirements whose transitive closure is the entire graph.
We compute this set by first building the graph of strongly connected components using Tarjan's algorithm. The graph of SCCs is a directed acyclic graph, which means we can compute the root set of the DAG. Finally, we pick a suitable representative requirement from each SCC in the root set, using the lexshort order on subject types.
Fixes rdar://problem/rdar62903491.
Next steps
This PR drops in the new algorithm and uses it in a couple of places without refactoring much else. The old removeSelfDerived() code is still part of some load-bearing hacks in computing conformance access paths and minimizing same-type requirements. There will be more code deleted in this area eventually.