RequirementMachine: Improved handling of "identity conformances" [P].[P] => [P] #39918

Merged (17 commits) on Oct 27, 2021

Conversation

@slavapestov (Contributor) commented on Oct 26, 2021:

This is just a big yak shave to get a pathological protocol definition by @jckarter to minimize correctly:

protocol O {
  associatedtype R : O
  associatedtype C : O
    where C.C.C.C == Self,
          C.C.R.C.C.R == Self,
          R.C.R.C.R.C == Self
}

The correct minimized signature here is

<Self where Self == Self.C.C.C.C, Self.C : O, Self.R : O, Self.C.C == Self.R.C.C.R, Self.C.R.C == Self.R.C.R>

While getting it to work, I uncovered some related issues, and fixed a few other bugs in the process.

Motivation

Consider this example:

protocol P {
  associatedtype X : P where X == Self
}

Clearly the conformance requirement X : P is redundant (and the GenericSignatureBuilder agrees), but previously nothing in the requirement machine's theory would make it so.
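
(Presumably, the minimized signature for this example is just <Self where Self == Self.X>, with the redundant conformance requirement dropped.)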

Here is the rewrite system:

(1) [P:X].[P] => [P:X]
(2) [P:X] => [P]

These two rules overlap on the term [P:X].[P]; resolving the critical pair introduces the 'identity conformance' [P].[P] => [P]. Homotopy reduction would delete this conformance, but at that point, [P:X].[P] => [P:X] would no longer be redundant, since nothing else proves that [P].[P] => [P].

Now that [P].[P] => [P] is a permanent rule, we can handle this properly: any conformance that appears in a rewrite loop together with an identity conformance without context is completely redundant, because it is equivalent to the empty generating conformance path.

The concrete type problem

However, this revealed a weakness in how completion handles concrete type substitutions. Suppose we have these rules:

(1) [P].[P] => [P]
(2) [P].[P:X] => [P:X]
(3) [P].[P:Y] => [P:Y]
(4) [P:X].[concrete: G<τ_0_0> with <[P:Y]>] => [P:X]

Rules (2) and (4) overlap on the following term, which has had the concrete type adjustment applied, transforming [P:Y] into [P].[P:Y] by prepending the prefix [P]:

[P].[P:X].[concrete: G<τ_0_0> with <[P].[P:Y]>]

The critical pair, obtained by applying rule (2) to both sides of the branching, is

[P:X].[concrete: G<τ_0_0> with <[P].[P:Y]>] => [P:X]

Note that this is a distinct rule from (4), and again this new rule overlaps with (2), producing another rule

[P:X].[concrete: G<τ_0_0> with <[P].[P].[P:Y]>] => [P:X]

This process doesn't terminate. The root cause of the problem is the existence of rule (1), which appears when there are multiple non-trivial conformance paths witnessing the conformance Self : P; this occurs when a same-type requirement is defined between Self and some other type conforming to P.

To make this work, we need to simplify concrete substitutions when adding a new rule in completion.

The solution

We need to simplify concrete substitutions when adding a rewrite rule, so for example if X.Y => Z, we want to simplify

A.[concrete: G<τ_0_0, τ_0_1> with <X.Y, X>] => A

to

A.[concrete: G<τ_0_0, τ_0_1> with <Z, X>] => A

The requirement machine used to do this, but I took it out, because it didn't fit with the rewrite path representation. Until now, a rewrite path was a composition of rewrite steps where each step would transform a source term into a destination term.

The question then becomes: how do we represent concrete substitution simplification in such a scheme?

One approach is to rework rewrite paths to a 'nested' representation, where a new kind of rewrite step applies a sequence of rewrite paths to the concrete substitution terms. Unfortunately this would complicate memory management and require recursion when visiting the steps of a rewrite path.

Another, simpler approach that I'm going with here is to generalize a rewrite path to a stack machine program instead.

I'm adding two new kinds of rewrite steps which manipulate a pair of stacks, called 'A' and 'B':

  • Decompose, which takes a term ending in a superclass or concrete type symbol, and pushes each concrete substitution onto the 'A' stack.

  • A>B, which pops the top of the 'A' stack and pushes it onto the 'B' stack.

Since all rewrite steps are invertible, the inverses of the two new step kinds are as follows:

  • Compose, which pops a series of terms from the 'A' stack, and replaces the concrete substitutions in the term ending in a superclass or concrete type symbol underneath.

  • B>A, which pops the top of the 'B' stack and pushes it onto the 'A' stack.

Both Decompose and Compose take an operand, which is the number of concrete substitutions to expect. This is encoded in the RuleID field of RewriteStep.

The two existing rewrite steps ApplyRewriteRule and AdjustConcreteType simply pop and push the term at the top of the 'A' stack.

Now, if addRule() wishes to transform

A.[concrete: G<τ_0_0, τ_0_1> with <X.Y, X>] => A

into

A.[concrete: G<τ_0_0, τ_0_1> with <Z, X>] => A

it can construct the rewrite path

Decompose(2) ⊗ A>B ⊗ <<rewrite path from X.Y to Z>> ⊗ B>A ⊗ Compose(2)

The existing 'evaluation' walks over rewrite paths, which used to mutate a single MutableTerm, now use a new RewritePathEvaluator type that stores the two stacks. The definition of a rule appearing "in empty context" for homotopy reduction's purposes is now narrower: in addition to the rewrite step having empty whiskers, the evaluator at that point must also have an 'A' stack consisting of exactly one entry, and an empty 'B' stack.
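
As a minimal illustration of this stack machine, here is a self-contained Swift sketch; Term, RewriteStep, and RewritePathEvaluator below are illustrative stand-ins for the compiler's internal (C++) types, not the actual implementation:

struct Term {
    var name: String
    var substitutions: [Term] = []  // non-empty iff the term ends in a concrete type symbol
}

enum RewriteStep {
    case rewriteRule((Term) -> Term)  // stands in for ApplyRewriteRule
    case decompose(Int)               // push each concrete substitution onto 'A'
    case compose(Int)                 // inverse: install the top N terms as substitutions
    case aToB                         // pop the top of 'A', push it onto 'B'
    case bToA                         // pop the top of 'B', push it onto 'A'
}

struct RewritePathEvaluator {
    var stackA: [Term]
    var stackB: [Term] = []

    mutating func apply(_ step: RewriteStep) {
        switch step {
        case .rewriteRule(let rule):
            // Pop the term at the top of 'A', rewrite it, push it back.
            stackA.append(rule(stackA.removeLast()))
        case .decompose(let count):
            // The term whose substitutions we extract stays underneath on 'A'.
            let subs = stackA[stackA.count - 1].substitutions
            precondition(subs.count == count)
            stackA.append(contentsOf: subs)
        case .compose(let count):
            // Fold the top `count` terms back into the term underneath them.
            let subs = Array(stackA.suffix(count))
            stackA.removeLast(count)
            stackA[stackA.count - 1].substitutions = subs
        case .aToB:
            stackB.append(stackA.removeLast())
        case .bToA:
            stackA.append(stackB.removeLast())
        }
    }
}

With this, the rewrite path shown above for simplifying <X.Y, X> to <Z, X> runs as:

var eval = RewritePathEvaluator(stackA: [
    Term(name: "A.[concrete: G<τ_0_0, τ_0_1>]",
         substitutions: [Term(name: "X.Y"), Term(name: "X")])
])

let path: [RewriteStep] = [
    .decompose(2),                           // 'A' now also holds X.Y and X
    .aToB,                                   // set X aside on 'B'
    .rewriteRule({ _ in Term(name: "Z") }),  // stands in for the path from X.Y to Z
    .bToA,                                   // bring X back
    .compose(2),                             // reassemble with substitutions <Z, X>
]
for step in path { eval.apply(step) }
// eval.stackA.last!.substitutions is now [Z, X].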

@slavapestov force-pushed the rqm-identity-conformance branch 3 times, most recently from 5c30182 to ee090c9 on October 26, 2021 03:27
@slavapestov marked this pull request as ready for review on October 26, 2021 03:28
@slavapestov force-pushed the rqm-identity-conformance branch from ee090c9 to dcec494 on October 27, 2021 01:50
…cking completion depth limit

We didn't look at the length of terms appearing in concrete substitutions,
so runaway recursion there was only caught by the completion step limit,
which takes much longer to trigger.
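
A rough sketch of the shape of such a check, with invented stand-in types (the actual code is C++ inside the RequirementMachine):

struct Symbol {
    var name: String
    var substitutions: [Term] = []  // non-empty for superclass/concrete type symbols
}

struct Term {
    var symbols: [Symbol]

    // Counts symbols inside concrete substitutions too, so runaway
    // recursion there trips the depth limit directly.
    var recursiveLength: Int {
        symbols.reduce(0) { total, symbol in
            total + 1 + symbol.substitutions.reduce(0) { $0 + $1.recursiveLength }
        }
    }
}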
…requirement-machine=completion

If we're not recording homotopy generators, we can't access
HomotopyGenerators.back() in the debug output.
…chine

This commit lays down the plumbing for the new rewrite steps described
above, and replaces the existing 'evaluation' walks over rewrite paths
that mutate a single MutableTerm with a new RewritePathEvaluator type
that stores the two stacks.

The changes to addRule() are coming in a subsequent commit.
…ite path

This is for convenience with the subsequent change to addRule().
… rule

To make completion terminate in the presence of the identity conformance
rule [P].[P] => [P], we need to simplify concrete substitutions when adding
a new rule in completion (see "The concrete type problem" above). Now that
rewrite paths can represent this form of simplification, this is easy to add.
…conformance

Completion adds this rule when a same-type requirement equates the Self of P
with some other type parameter conforming to P; one example is something like this:

  protocol P {
    associatedtype T : P where T.T == Self
  }

The initial rewrite system looks like this:

  (1) [P].T => [P:T]
  (2) [P:T].[P] => [P:T]
  (3) [P:T].[P:T] => [P]

Rules (2) and (3) overlap on the term

  [P:T].[P:T].[P]

Simplifying the overlapped term with rule (2) produces

  [P:T].[P:T]

which further reduces, via rule (3), to

  [P]

Simplifying the overlapped term with rule (3) produces

  [P].[P]

So we get a new rule

  [P].[P] => [P]

This is not a "real" conformance rule, and homotopy reduction wastes work
unraveling it in rewrite loops where it occurs. Instead, it's better to
introduce it for every protocol as a permanent rule that is not subject to
homotopy reduction.

Instead of finding the best redundancy candidate from the first
homotopy generator that has one, find the best redundancy
candidate from *all* homotopy generators that have one.
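
A sketch of the shape of this change, with invented names (the actual logic is C++ in the homotopy reduction pass):

struct Candidate {
    var canonicality: Int  // lower means less canonical, in this toy ordering
}

struct RewriteLoop {
    var bestRedundancyCandidate: Candidate?
}

// Before: return the candidate from the first loop that has one.
func pickBefore(_ loops: [RewriteLoop]) -> Candidate? {
    for loop in loops {
        if let found = loop.bestRedundancyCandidate { return found }
    }
    return nil
}

// After: gather candidates from *all* loops, then pick the least
// canonical one, so more canonical rules survive minimization.
func pickAfter(_ loops: [RewriteLoop]) -> Candidate? {
    return loops.compactMap { $0.bestRedundancyCandidate }
                .min(by: { $0.canonicality < $1.canonicality })
}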

This correctly minimizes the "Joe Groff monoid" without hitting
the assertion guarding against simplified rules being non-redundant.

It also brings us closer to being able to correctly minimize
the protocol from https://bugs.swift.org/browse/SR-7353.
Now that the 'identity conformance' [P].[P] => [P] is permanent,
we won't see it here, so we can just assume that any non-permanent,
non-redundant rule maps to a requirement.
@slavapestov force-pushed the rqm-identity-conformance branch from dcec494 to c9abfa3 on October 27, 2021 04:45
@slavapestov commented:

@swift-ci Please smoke test

Just like in homotopy reduction, when finding a minimal set of generating
conformances, we should try to eliminate less canonical conformance rules
first.

This fixes a test case where we would delete a more canonical conformance
requirement, triggering an assertion that simplifed rules must be
redundant, because the less canonical conformance was simplified, whereas
the more canonical one was redundant and not simplified.
Also, stop talking about 2-cells and 3-cells.
Signed vs unsigned bitfields strike again.
@slavapestov force-pushed the rqm-identity-conformance branch from dc142e7 to 5067bfe on October 27, 2021 05:28
@slavapestov commented:

@swift-ci Please smoke test
