-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[Dependency Scanning] Refactor the scanner to resolve unqualified module imports #62696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Dependency Scanning] Refactor the scanner to resolve unqualified module imports #62696
Conversation
0ccb933
to
5334af0
Compare
@swift-ci test |
5334af0
to
6f15a3d
Compare
@swift-ci test |
I have a test project with a large target with 1665 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Hoisting the cache query above all importers makes a lot of sense.
|
||
/// Record dependencies for the given module. | ||
const ModuleDependencyInfo *recordDependencies(StringRef moduleName, | ||
const ModuleDependencyInfo *recordDependency(StringRef moduleName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'm not quite following is the de-pluralization of these method names. Aren't there potentially multiple dependencies of moduleID
in the given ModuleDependencyInfo
collection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While there are, indeed, multiple dependencies of moduleID
in the given ModuleDependencyInfo
, the thing being recorded here is the existence of a ModuleDependencyInfo
itself for this moduleID
. Which does contain a collection of its own dependencies' IDs, each of which will then correspond to a ModuleDependencyInfo
to be queried via these same API, but also a bunch of other metadata we need to know about this dependency.
I can see the logic behind the pluralization but it's been bugging me for a while and it seems to be clearer to be explicit about the fact that these API are vending these info values that describe a particular module, not just that module's dependencies.
For example, before this patch hasDependencies("foo")
would indicate whether or not a module "foo"
has been discovered as a dependency during this scan. To your point, it does also indicate that we know dependencies of foo
in its respective info, but the use of these queries is broader than querying dependencies of this module as it also covers getting a dependency info about this module for other purposes as well. It seems conceptually more-accurate to have some thing like findDependencies("foo")
return an actual list of dependencies of foo
, rather than an info describing foo
as a discovered dependency itself.
I'm having to really make an argument here so I can see it either way and can be convinced otherwise, but this seemed more appropriate as I was already restructuring things!
6f15a3d
to
5ee17bc
Compare
@swift-ci test |
5ee17bc
to
311df57
Compare
@swift-ci test |
@swift-ci test macOS platform |
1 similar comment
@swift-ci test macOS platform |
@swift-ci test macOS platform |
311df57
to
1343254
Compare
@swift-ci test |
…ule imports This changes the scanner's behavior to "resolve" a discovered module's dependencies to a set of Module IDs: module name + module kind (swift textual, swift binary, clang, etc.). The 'ModuleDependencyInfo' objects that are stored in the dependency scanner's cache now carry a set of kind-qualified ModuleIDs for their dependencies, in addition to unqualified imported module names of their dependencies. Previously, the scanner's internal state would cache a module dependnecy as having its own set of dependencies which were stored as names of imported modules. This led to a design where any time we needed to process the dependency downstream from its discovery (e.g. cycle detection, graph construction), we had to query the ASTContext to resolve this dependency's imports, which shouldn't be necessary. Now, upon discovery, we "resolve" a discovered dependency by executing a lookup for each of its imported module names (this operation happens regardless of this patch) and store a fully-resolved set of dependencies in the dependency module info. Moreover, looking up a given module dependency by name (via `ASTContext`'s `getModuleDependencies`) would result in iterating over the scanner's module "loaders" and querying each for the module name. The corresponding modules would then check the scanner's cache for a respective discovered module, and if no such module is found the "loader" would search the filesystem. This meant that in practice, we searched the filesystem on many occasions where we actually had cached the required dependency, as follows: Suppose we had previously discovered a Clang module "foo" and cached its dependency info. -> ASTContext.getModuleDependencies("foo") --> (1) Swift Module "Loader" checks caches for a Swift module "foo" and doesn't find one, so it searches the filesystem for "foo" and fails to find one. --> (2) Clang Module "Loader" checks caches for a Clang module "foo", finds one and returns it to the client. This means that we were always searching the filesystem in (1) even if we knew that to be futile. With this change, queries to `ASTContext`'s `getModuleDependencies` will always check all the caches first, and only delegate to the scanner "loaders" if no cached dependency is found. The loaders are then no longer in the business of checking the cached contents. To handle cases in the scanner where we must only lookup either a Swift-only module or a Clang-only module, this patch splits 'getModuleDependencies' into an alrady-existing 'getSwiftModuleDependencies' and a newly-added 'getClangModuleDependencies'.
1343254
to
12477b7
Compare
@swift-ci test |
Apologies for the size of this PR, the changes here are somewhat-intertwined refactoring of how the scanner works overall.
This patch changes the scanner's behavior to "resolve" a discovered module's dependencies to a set of Module IDs: module name + module kind (swift textual, swift binary, clang, etc.).
The
ModuleDependencyInfo
objects that are stored in the dependency scanner's cache now carry a set of kind-qualified ModuleIDs for their dependencies, in addition to unqualified imported module names of their dependencies.Previously, the scanner's internal state would cache a module dependnecy as having its own set of dependencies which were stored as names of imported modules. This led to a design where any time we needed to process the dependency downstream from its discovery (e.g. cycle detection, graph construction), we had to query the
ASTContext
to resolve this dependency's imports to what kind of module they are (Swift
,Clang
, etc), which shouldn't be necessary. Now, upon discovery, we "resolve" a discovered dependency by executing a lookup for each of its imported module names (this operation happens regardless of this patch) and store a fully-resolved set of dependencies in the dependency module info.Moreover, looking up a given module dependency by name (via
ASTContext
'sgetModuleDependencies
)(either for the first time, or to qualify its kind later) would result in iterating over the scanner's module "loaders" and querying each for the module name. The corresponding modules would then check the scanner's cache for a respective discovered module, and if no such module is found the "loader" would search the filesystem.This meant that in practice, we searched the filesystem on many occasions where we actually had cached the required dependency, as follows:
Suppose we had previously discovered a Clang module
foo
and cached its dependency info.->
ASTContext.getModuleDependencies("foo")
---> (1) Swift Module "Loader" checks caches for a Swift module
foo
and doesn't find one, so it searches the filesystem forfoo
and fails to find one there also.---> (2) Clang Module "Loader" checks caches for a Clang module
foo
, finds one and returns it to the client.This means that we were always searching the filesystem in (1) even if we knew all the information required to deem that to be futile.
With this change, queries to
ASTContext
'sgetModuleDependencies
will always check all the caches first, and only delegate to the scanner "loaders" if no cached dependency is found. The loaders are then no longer in the business of checking the cached contents.To handle cases in the scanner where we must only lookup either a Swift-only module or a Clang-only module, this patch splits
getModuleDependencies
into an already-existinggetSwiftModuleDependencies
and a newly-addedgetClangModuleDependencies
.Note, in expectation this change does not alter the scanner's functionality. The test changes here are mostly to make the tests more resilient to the ordering in which dependency information is emitted.