-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[GSB] Term rewriting for same-type constraints #14454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Introduce a new representation of same-type constraints as rewrite rules in a term-rewriting system, where each same-type constraint maps to a rewrite rule that produces a "more canonical" term. Fully-simplifying a type according to these rewrite rules will produce the canonical representation of that type, i.e., the "anchor" of its equivalence class. The rewrite rules are stored as a prefix tree, such that a "match" operation walks the tree matching the prefix of a path of associated type references, e.g., SubSequence -> SubSequence -> Element, and any node within the tree can provide a replacement path (e.g., "Element"). The "dump" operation provides a visualization of the tree, e.g., :: `--(cont'd) `--Sequence.Iterator | `--IteratorProtocol.Element --> [Sequence.Element] `--Sequence.SubSequence --> [] | `--Sequence.Element --> [Sequence.Iterator -> IteratorProtocol.Element] | | `--(cont'd) --> [Sequence.Element] | `--Sequence.Iterator --> [Sequence.Iterator] | | `--IteratorProtocol.Element --> [Sequence.Iterator -> IteratorProtocol.Element] | `--Sequence.SubSequence --> [] | | `--Sequence.Element --> [Sequence.Element] | | `--Sequence.Iterator --> [Sequence.Iterator] | `--C.Index --> [C.Index] | `--C.Indices --> [C.Indices] | `--Sequence.Element --> [C.Indices -> Sequence.Element] | `--Sequence.SubSequence --> [C.Indices -> Sequence.SubSequence] | `--C.Index --> [C.Indices -> C.Index] | `--C.Indices --> [C.Indices -> C.Indices] `--C.Index --> [Sequence.Element] `--C.Indices `--Sequence.Element --> [C.Index] Thus far, there are no clients for this information: this commit starts building these rewriting trees and can visualize them for debug information.
Fully rewriting a given type will produce the canonical representation of that type, because all rewrite rules take a step toward a more-canonical type. Use this approach to compute the anchor of an equivalence class, replacing the existing ad hoc approach of enumerating known potential archetypes—which was overly dependent on having the “right” set of potential archetypes already computed.
Previously, the term rewriting logic for same-type constraints was only able to express “relative” rewrites, where the base of the type being rewritten (i.e., the generic type at the root) is unchanged. This meant that we were unable to capture same-type constraints such as “T == U” or “T.Foo == T” within the rewriting system. Separate out the notion of a rewrite path and extend it with an optional new base. When we’re simplifying a term and we encounter one of these replacements, the whole type from the base up through the last-matched path component gets replaced with the replacement path.
Use term rewriting exclusively in the computation of the canonical dependent type (anchor) for an equivalence class, eliminating any dependence on the specific potential archetypes and eliminating the hacks around generic type parameters.
45924bc
to
1e7562a
Compare
@swift-ci please smoke test |
@swift-ci please test source compatibility |
@swift-ci please test compiler performance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's remarkably... neat and simple! Very cool.
I do notice a lack of tests though; is it not quite working fully?
/// types. | ||
Optional<RewritePath> static createPath(Type type); | ||
|
||
/// Decompose a potential archetype into a patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(s/patch/path/)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thanks. A later PR (#14685) is going to delete this code (yay!)
/// with concrete declarations. | ||
Optional<RewritePath> static createPath(PotentialArchetype *pa); | ||
|
||
/// Compute the common path between this path and \c other, if one exists. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Common path being the longest prefix? What's the conditions under which one doesn't exist? It looks like it's the bases being different from the code. Why's that case different to a path being empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's the longest common prefix. The failure case is when the bases are different... but you're right that I can express this as a completely-empty path.
|
||
/// Retrieve the base of the given rewrite path. | ||
/// | ||
/// When present, it indicates that the entire path will be rebased on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And what's it mean when absent? A floating path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, "floating" or "relative".
void enumerateRewritePaths( | ||
RelativeRewritePath matchPath, | ||
llvm::function_ref<void(unsigned, RewritePath)> callback, | ||
unsigned depth = 0) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do callers need to do anything with depth
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm being lazy; I'll split it out.
SmallVector<AssociatedTypeDecl *, 4> path(initialPath->getPath().begin(), | ||
initialPath->getPath().end()); | ||
bool simplified = false; | ||
do { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking my understanding: this goes along a path like A.B.C.D
looking for rewrites first for B.C.D
within {A}
, then for C.D
within {A.B}
, etc., and if one succeeds (say, {A.B}.C.D
-> X.Y.Z
) it starts again? And, the loop terminates because the compareDependentPaths
/swap
calls in addSameTypeRewriteRule
ensure that the target of each rule is smaller in some sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's correct.
// Merge the second rewrite tree into the first. | ||
rewriteRoot2->mergeInto(rewriteRoot1); | ||
Impl->RewriteTreeRoots.erase(equivClass2); | ||
delete rewriteRoot2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious: since there's at least two of these manual clean-ups, unique_ptr
or similar mustn't work with DenseMap
(and how it's being used); why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of doing something clever with re-using root nodes, but I don't think it's worthwhile. I'll switch it over to using std::unique_ptr
for the roots.
@huonw, regarding your testing comment, now that we're computing all of the anchors of equivalence classes via this code path (and cough getting it right modulo the issues resolved by #14685), we're getting quite a lot of coverage for these code paths. I could maybe do some FileCheck-based testing to see that we're getting the expected rewrite trees for particular generic signatures, I guess. |
Introduce a new representation of same-type constraints as rewrite
rules in a term-rewriting system, where each same-type constraint maps
to a rewrite rule that produces a "more canonical"
term. Fully-simplifying a type according to these rewrite rules will
produce the canonical representation of that type, i.e., the "anchor"
of its equivalence class. Use this term-rewriting system to replace the
existing, weird "anchor" computation.
The rewrite rules are stored as a prefix tree, such that a "match"
operation walks the tree matching the prefix of a path of associated
type references, e.g., SubSequence -> SubSequence -> Element, and any
node within the tree can provide a replacement path (e.g.,
"Element"). The "dump" operation provides a visualization of the tree,
e.g.,