-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[Evaluator] Online Request-Based Incremental Dependency Tracking #30723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Re-enabling the AnyObject test might sink us on Windows, but let's see @swift-ci test |
Hah, my assertion tripped an assertion. |
e11f54b
to
3b9756f
Compare
@swift-ci please smoke test |
@swift-ci test Windows |
@CodaFi Just read the description, prior to reviewing it. It would really help me to follow it if you could add an explanation of the terms "dependency source" and "dependency sink". Thanks! |
No description provided. |
@swift-ci test |
3b9756f
to
3a333f5
Compare
@swift-ci test compiler performance |
@swift-ci smoke test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a top-level requests for SIL optimization as well, otherwise we might miss dependency edges when optimizing code (it's tricky to come up with them, but, eg, conformance checks)
auto *unit = desc.context.dyn_cast<FileUnit *>(); | ||
return { | ||
dyn_cast_or_null<SourceFile>(unit), | ||
evaluator::DependencyScope::Cascading |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should similarly flip over to private dependencies when we SILGen or IRGen a function body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I wonder if all dependencies recorded during SILGen and IRGen can be non-cascading. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a fine idea, but we need to get more tests up for dependencies discovered during those later phases. In fact, we need to get any tests for this since we currently emit reference dependencies after Sema.
// UNSUPPORTED: CPU=armv7k && OS=ios | ||
// Exclude iOS-based 32-bit platforms because the Foundation overlays introduce | ||
// an extra dependency on _KeyValueCodingAndObservingPublishing only for 64-bit | ||
// platforms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you instead add --check-prefix=CHECK-%target-ptrsize to the RUN: line, and use CHECK-32/CHECK-64 to conditionalize this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll get it in a follow-up. This test has been historically difficult to gate on our more exotic configurations.
Wow! I'm very excited by the payoff of this work! Thank you very much for adding the source and sink explanation. Now I understand. I have two thoughts, just about this text:
I'm not sure what the best way forward is... the easiest thing would be adding something to clarify the two terminologies somewhere, maybe? Anyway, submitted for your consideration... Thanks! |
docs/RequestEvaluator.md
Outdated
## Incremental Dependency Tracking | ||
Request evaluation naturally captures the dependency structure of any given invocation of the compiler frontend. In fact, it captures it so well that the request graph trace generated by a select kind of lookup request can be used to completely recover the information relevant to the Swift compiler's incremental compilation subsystem. For these select *dependency-relevant* requests, we can further subdivide them into so-called *dependency sources* and *dependency sinks*. A dependency source is any (usually high-level) request that introduces a new context under which dependencies can be registered. Currently, these are the requests that operate that the level of individual source files. A dependency sink is any (usually lower-level) request that executes as a sub-computation of a depenendecy source. Any names that are dependency-relevant, such as the result of a lookup in a particular context, are then registered against trackers in the active dependency source (file). Using this, the evaluator pushes and pops sources and sinks automatically as request evaluation proceeds, and sink requests pair automatically to source requests to write out dependency information. | ||
|
||
To define a request as a dependency source, it must implement an accessor for the new active scope (`readDependencySource`). To define a request as a dependency sink, it must implement a function that writes the result of evaluating the request into the current active source (`writeDependencySink`). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah!! I hadn't looked at the diffs before I wrote the comment on the conversation page. This is great! Maybe just add something like: "These terms apply when considering how the frontend finds and records uses for the dependency graph." In the wider context of the graph, a source could be considered to be a definition of an entity, and a sink could be a use, such that when the source changes, the use must be recompiled."
include/swift/AST/Evaluator.h
Outdated
template <typename Request, | ||
typename std::enable_if<!Request::isDependencySink>::type * = nullptr> | ||
void reportEvaluatedResult(const Request &r, | ||
const typename Request::OutputType &o) const {} | ||
|
||
template <typename Request, | ||
typename std::enable_if<Request::isDependencySink>::type * = nullptr> | ||
void reportEvaluatedResult(const Request &r, | ||
const typename Request::OutputType &o) { | ||
r.writeDependencySink(*this, o); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just read both of these over about four times, and I finally spotted the "const" difference. Sigh... C++. But now I'm wondering why it's OK to not write the sink if the receiver happens to be const
. Would it be possible to put in a comment explaining this? Is there a possibility of creating a bug by calling the first one when a write is really needed because of a bogus "const" on the receiver of the message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice solution to a tricky problem overall! Clearly took lots of careful thought.
Two items stick in my mind:
-
To help future maintainers, add some verbiage in the documentation file explaining the different meanings of Dependency source and -sink.
-
I don't like passing a pointer to the ReferencedNameTracker into the dependency-graph-creation systems (e.g. emitReferenceDependencies, etc.). I want to preserve the restriction that the tracker used goes with the source file.
At first, I thought, well, just in a boolean instead of a pointer. But, after reading through all the diffs, would the following work?
If you can get from a SourceFile, through the ASTContext, to the LangOpts, you could just have a get member function on the source file to get the right tracker. Then, just use that member wherever the old code used to get the tracker pointer out of the source file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
49e71ac
to
3806b6a
Compare
Formalize DependencyScope, DependencySource, and the incremental dependency stack. Also specialize SimpleRequest to formalize dependency sources and dependency sinks. This allows the evaluator's internal entrypoints to specalize away the incremental dependency tracking infrastructure if a request is not actually dependency-relevant.
3806b6a
to
205a2b4
Compare
Request-based incremental dependencies are enabled by default. For the time being, add a flag that will turn them off and switch back to manual dependency tracking.
Plug high-level requests that define dependency sources into the evaluator's incremental infrastructure.
Convert most of the name lookup requests and a few other ancillary typechecking requests into dependency sinks. Some requests are also combined sinks and sources in order to emulate the current scheme, which performs scope changes based on lookup flags. This is generally undesirable, since it means those requests cannot immediately be generalized to a purely context-based scheme because they depend on some client-provided entropy source. In particular, the few callers that are providing the "known private" name lookup flag need to be converted to perform lookups in the appropriate private context. Clients that are passing "no known dependency" are currently considered universally incorrect and are outside the scope of the compatibility guarantees. This means that request-based dependency tracking registers strictly more edges than manual dependency tracking. It also means that once we fixup the clients that are passing "known private", we can completely remove these name lookup flags. Finally, some tests had to change to accomodate the new scheme. Currently, we go out of our way to register a dependency edge for extensions that declare protocol conformances. However, we were also asserting in at least one test that extensions without protocol conformances weren't registering dependency edges. This is blatantly incorrect and has been undone now that the request-based scheme is automatically registering this edge.
205a2b4
to
0879962
Compare
* Document a number of legacy conditions and edge cases * Add lexicon definitions for "dependency source", "dependency sink", "cascading dependency" and "private dependency"
0879962
to
ee723cd
Compare
I believe that's everything resolved. @swift-ci please smoke test |
⛵️ |
Split off the notion of "recording" dependencies from the notion of "collecting" dependencies. This corrects an oversight in the previous design where dependency replay and recording were actually not "free" in WMO where we actually never track dependencies. This architecture also lays the groundwork for the removal of the referenced name trackers. The algorithm builds upon the infrastructure for dependency sources and sinks laid down during the cut over to request-based dependency tracking in swiftlang#30723. The idea of the naive algorithm is this: For a chain of requests A -> B* -> C -> D* -> ... -> L where L is a lookup request and all starred requests are cached, once L writes into the dependency collector, the active stack is walked and at each cache-point the results of dependency collection are associated with the request itself (in this example, B* and D* have all the names L found associated with them). Subsequent evaluations of these cached requests (B* and D* et al) will then *replay* the previous lookup results from L into the active referenced name tracker. One complication is, suppose the evaluation of a cached request involves multiple downstream name lookups. More concretely, suppose we have the following request trace: A* -> B -> L | -> C -> L | -> D -> L | -> ... Then A* must see the union of the results of each L. If this reminds anyone of a union-find, that is no accident! A persistent union-find a la Conchon and Filliatre is probably in order to help bring down peak heap usage...
This pull request grants the evaluator the ability to register incremental dependencies as a consequence of request evaluation. This also cuts the compiler over to using the evaluator's name tracker as the source of dependency information, and provides the staging flag
-disable-request-based-incremental-dependencies
should we require the old scheme once again.The plan of attack is this:
Sources
A request marked as a dependency source declares that its evaluation defines a scope under which dependency information should be registered. As request evaluation proceeds, additional source requests will push their scopes onto the evaluator's stack, and have those scopes removed when their execution terminates.
A dependency source under the request-based system is equivalent to a change of decl context under the manual scheme. As such, there are very few requests outside of the very high-level ones like
TypeCheckSourceFileUntil
that can claim to be valid dependency sources, but they do exist. For example,ModuleQualifiedLookupRequest
switches its lookup context based on the decl context the lookup is performed from and writes into the referenced name tracker for the associated file (if any).Sinks
A request marked as a dependency sink declares that any reads of the information computed by the request have dependency information to write into the active scope. That is, the act of "reading" the result of the request completes a dependency edge from the active scope to the name being looked up.
Note that it is insufficient to instrument the cache points or the
evaluate
method alone for dependency sinks, thus we have a separatewrite
function to implement. This is because a request that populates the cache will not re-executeevaluate
to register the edge, even if the active scope has changed. Imagine a frontend run with two primaries that each try to fire off aSuperclassTypeRequest
if we did not instrument all reads. Theevaluate
kicked off by the first primary will warm the cache and register the edge, the secondevaluate
kicked off by the second primary will read the cache and skipevaluate
. Instrumenting the cache-points manually is also right out since there are internally-cached requests that have nothing to instrument!A dependency sink under the request-based system is associated one-to-one with a dependency-relevant use of a name. The ideal candidate for a dependency sink is therefore the lookup requests themselves - and indeed, lookup requests that register dependency edges manually have all been declared as sinks.
Exceptions
The tricky ones are the type checking requests that run in the primary file. Currently, redeclaration checking and protocol conformance checking register the bulk of provides edges in the file. They eventually bottom out in a lookup, but additional context is required to emulate the old scheme. So, to preserve compatibility, they are also declared as dependency sources to simulate the manual context change, and dependency sinks to write out the edges. We can and should scrap this in the future.
Effect on The Evaluator
This means that the model of request evaluation must be generalized to accommodate the source and sink points. For that, define a new dependency source stack in the evaluator and have request evaluation automatically push and pop scopes as needed (for dependency neutral requests, the template specialization ensures this operation is a no-op). For dependency sinks, all of the read-points of the evaluator (reading from the internal cache, external cache, and uncached evaluation) have been instrumented to call the dependency sink point (for dependency neutral requests, this is also a no-op because of template specialization).
Under this scheme, it should be the case that the only request that really flips the dependency scope from cascading to non-cascading is
TypeCheckFunctionBodyUntilRequest
. However, an attempt has been made at compatibility with the old manual scheme. Especially name lookup requests that use client-provided flags to register edges, will have those flags taken into consideration and so are also registered as dependency sources and sinks as appropriate.There are two notable places where those compatibility guarantees fail:
rdar://59773704, rdar://60050653