GSB: New algorithm for identifying redundant non-same-type requirements #36109

slavapestov · 2021-02-23T21:31:02Z

Generic signature minimization needs to diagnose and remove any redundant requirements, that is, requirements that can be proven from some subset of the remaining requirements.

For each requirement on an equivalence class, we record a set of RequirementSources; a RequirementSource is a "proof" that the equivalence class satisfies the requirement.

A RequirementSource is either "explicit" -- meaning it corresponds to a generic requirement written by the user -- or it is "derived", meaning it can be proven from some other explicit requirement.

The most naive formulation of the minimization problem is that we say that an explicit requirement is redundant if there is a derived source for the same requirement. However, this is not sufficient, for example:

protocol P {
  associatedtype A : P where A.A == Self
}

In the signature , the explicit requirement T : P also has a derived source T.A.A : P. However, this source is "self-derived", meaning that in order to obtain the witness table for T.A.A : P, we first have to obtain the witness table for T.A.

The GenericSignatureBuilder handled this kind of 'self-derived' requirement correctly, by removing it from the list of sources. This was implemented in the removeSelfDerived() function.

After removeSelfDerived() was called, any remaining derived requirement sources were assumed to obsolete any explicit source for the same requirement.

However, even this was not sufficient -- namely, it only handled the case where a explicit requirement would imply a derived source for itself, and not a cycle involving multiple explicit sources that would imply each other.

For example, the following generic signature would be misdiagnosed with both conformance requirements as redundant, resulting in an invalid generic signature:

protocol P {
  associatedtype T : P
}

func f<T : P, U : P>(_: T, _: U) where T.X == U, U.X == T {}

In the above example, T : P has an explicit requirement source, as well as a derived source (U : P)(U.X : P). Similarly, U : P has an explicit requirement source, as well as a derived source (T : P)(T.X : P). Since neither of the derived sources were "self-derived" according to our definition, we would diagnose both explicit sources as redundant. But of course, after dropping them, we are left with the following generic signature:

func f<T, U>(_: T, _: U) where T.X == U, U.X == T {}

This is no longer valid -- since neither T nor U have a conformance requirement, the nested types T.X and U.X referenced from our same-type requirements are no longer valid.

The new algorithm abandons the "self-derived" concept. Instead, we build a directed graph where the vertices are explicit requirements, and the edges are implications where one explicit requirement implies another. In the above example, each of the explicit conformance requirements implies the other. This means a correct minimization must pick exactly one of the two -- not zero, and not both.

The set of minimized requirements is formally defined as the minimum set of requirements whose transitive closure is the entire graph.

We compute this set by first building the graph of strongly connected components using Tarjan's algorithm. The graph of SCCs is a directed acyclic graph, which means we can compute the root set of the DAG. Finally, we pick a suitable representative requirement from each SCC in the root set, using the lexshort order on subject types.

Fixes rdar://problem/rdar62903491.

Next steps

This PR drops in the new algorithm and uses it in a couple of places without refactoring much else. The old removeSelfDerived() code is still part of some load-bearing hacks in computing conformance access paths and minimizing same-type requirements. There will be more code deleted in this area eventually.

slavapestov · 2021-03-01T05:18:17Z

Thank you to @ornata for suggesting the idea of using the strongly-connected-components DAG to compute the set of minimized requirements.

Generic signature minimization needs to diagnose and remove any redundant requirements, that is, requirements that can be proven from some subset of the remaining requirements. For each requirement on an equivalence class, we record a set of RequirementSources; a RequirementSource is a "proof" that the equivalence class satisfies the requirement. A RequirementSource is either "explicit" -- meaning it corresponds to a generic requirement written by the user -- or it is "derived", meaning it can be proven from some other explicit requirement. The most naive formulation of the minimization problem is that we say that an explicit requirement is redundant if there is a derived source for the same requirement. However, this is not sufficient, for example: protocol P { associatedtype A : P where A.A == Self } In the signature <T where T : P>, the explicit requirement T : P also has a derived source T.A.A : P. However, this source is "self-derived", meaning that in order to obtain the witness table for T.A.A : P, we first have to obtain the witness table for T.A. The GenericSignatureBuilder handled this kind of 'self-derived' requirement correctly, by removing it from the list of sources. This was implemented in the removeSelfDerived() function. After removeSelfDerived() was called, any remaining derived requirement sources were assumed to obsolete any explicit source for the same requirement. However, even this was not sufficient -- namely, it only handled the case where a explicit requirement would imply a derived source for itself, and not a cycle involving multiple explicit sources that would imply each other. For example, the following generic signature would be misdiagnosed with *both* conformance requirements as redundant, resulting in an invalid generic signature: protocol P { associatedtype T : P } func f<T : P, U : P>(_: T, _: U) where T.X == U, U.X == T {} In the above example, T : P has an explicit requirement source, as well as a derived source (U : P)(U.X : P). Similarly, U : P has an explicit requirement source, as well as a derived source (T : P)(T.X : P). Since neither of the derived sources were "self-derived" according to our definition, we would diagnose *both* explicit sources as redundant. But of course, after dropping them, we are left with the following generic signature: func f<T, U>(_: T, _: U) where T.X == U, U.X == T {} This is no longer valid -- since neither T nor U have a conformance requirement, the nested types T.X and U.X referenced from our same-type requirements are no longer valid. The new algorithm abandons the "self-derived" concept. Instead, we build a directed graph where the vertices are explicit requirements, and the edges are implications where one explicit requirement implies another. In the above example, each of the explicit conformance requirements implies the other. This means a correct minimization must pick exactly one of the two -- not zero, and not both. The set of minimized requirements is formally defined as the minimum set of requirements whose transitive closure is the entire graph. We compute this set by first building the graph of strongly connected components using Tarjan's algorithm. The graph of SCCs is a directed acyclic graph, which means we can compute the root set of the DAG. Finally, we pick a suitable representative requirement from each SCC in the root set, using the lexshort order on subject types. This commit implements the new algorithm, runs it on each generic signature and asserts some properties about the results, but doesn't actually use the algorithm for computing the minimized signature or diagnosing redundancies; that will come in the next commit.

slavapestov · 2021-03-01T22:27:07Z

Source compatibility already passed and the updated PR only adds some tests.

slavapestov · 2021-03-02T00:16:40Z

@swift-ci Please smoke test

slavapestov force-pushed the scc-redundant-requirements-algorithm branch 8 times, most recently from 5809f69 to 0d90f50 Compare March 1, 2021 05:13

slavapestov changed the title ~~Scc redundant requirements algorithm~~ GSB: New algorithm for identifying redundant requirements Mar 1, 2021

slavapestov marked this pull request as ready for review March 1, 2021 05:14

This comment has been minimized.

Sign in to view

slavapestov force-pushed the scc-redundant-requirements-algorithm branch 3 times, most recently from ca473dc to f30af84 Compare March 1, 2021 22:26

slavapestov changed the title ~~GSB: New algorithm for identifying redundant requirements~~ GSB: New algorithm for identifying redundant non-same-type requirements Mar 1, 2021

GSB: Use redundant requirement graph in enumerateRequirements()

0be55c1

slavapestov force-pushed the scc-redundant-requirements-algorithm branch from f30af84 to 0be55c1 Compare March 1, 2021 22:50

slavapestov merged commit 85bbf70 into swiftlang:main Mar 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

GSB: New algorithm for identifying redundant non-same-type requirements #36109

GSB: New algorithm for identifying redundant non-same-type requirements #36109

Uh oh!

slavapestov commented Feb 23, 2021 •

edited

Loading

Uh oh!

slavapestov commented Mar 1, 2021

Uh oh!

This comment has been minimized.

This comment has been minimized.

slavapestov commented Mar 1, 2021

Uh oh!

slavapestov commented Mar 2, 2021

Uh oh!

Uh oh!

GSB: New algorithm for identifying redundant non-same-type requirements #36109

GSB: New algorithm for identifying redundant non-same-type requirements #36109

Uh oh!

Conversation

slavapestov commented Feb 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Next steps

Uh oh!

slavapestov commented Mar 1, 2021

Uh oh!

This comment has been minimized.

This comment has been minimized.

slavapestov commented Mar 1, 2021

Uh oh!

slavapestov commented Mar 2, 2021

Uh oh!

Uh oh!

slavapestov commented Feb 23, 2021 •

edited

Loading