Skip to content

[Dependency Scanning] Parallelize Clang module queries #76915

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 5, 2024

Conversation

artemcm
Copy link
Contributor

@artemcm artemcm commented Oct 8, 2024

This change refactors the top-level dependency scanning flow to follow the following procedure:

resolveImportedModuleDependencies():

  1. From the source target under scan, query all imported module identifiers for a Swift module. Leave unresolved identifiers unresolved. Proceed transitively to build a Swift module dependency graph.
  2. Take every unresolved import identifier in the graph from (1) and, assuming that it must be a Clang module, dispatch all of them to be queried in-parallel by the scanner's worker pool.
  3. Resolve bridging header Clang module dependencies
  4. Resolve all Swift overlay dependencies, relying on all Clang modules collected in (2) and (3)

Following resolveImportedModuleDependencies(), for the source target under scan, use all of the above discovered module dependencies to resolve all cross-import overlay dependencies.

Because operation (1) is typically quite fast (it does not take much work to discover Swift modules and their dependencies), this results in a much higher degree of parallelism available for operation (2). Note, that operations (4) and (5) themselves call into the top-level resolveImportedModuleDependencies for newly-discovered modules.

@artemcm
Copy link
Contributor Author

artemcm commented Oct 8, 2024

@swift-ci test

@artemcm artemcm force-pushed the AsyncScanExperiment branch 2 times, most recently from ba9e560 to 5510451 Compare October 9, 2024 21:09
@artemcm
Copy link
Contributor Author

artemcm commented Oct 9, 2024

@swift-ci test

@artemcm artemcm force-pushed the AsyncScanExperiment branch from 5510451 to 46eabb9 Compare October 9, 2024 22:01
@artemcm
Copy link
Contributor Author

artemcm commented Oct 9, 2024

@swift-ci test

@artemcm artemcm force-pushed the AsyncScanExperiment branch from 46eabb9 to 853a637 Compare October 9, 2024 22:38
@artemcm
Copy link
Contributor Author

artemcm commented Oct 9, 2024

@swift-ci test

@artemcm artemcm force-pushed the AsyncScanExperiment branch from 853a637 to 10a1ab0 Compare October 10, 2024 17:26
@artemcm
Copy link
Contributor Author

artemcm commented Oct 10, 2024

@swift-ci test

@artemcm
Copy link
Contributor Author

artemcm commented Oct 10, 2024

@swift-ci test macOS platform

@artemcm artemcm force-pushed the AsyncScanExperiment branch from 10a1ab0 to d17efc1 Compare October 17, 2024 21:20
@artemcm
Copy link
Contributor Author

artemcm commented Oct 17, 2024

@swift-ci test

@artemcm artemcm force-pushed the AsyncScanExperiment branch from d17efc1 to f54273b Compare October 29, 2024 21:03
@artemcm
Copy link
Contributor Author

artemcm commented Oct 29, 2024

@swift-ci test

This change refactors the top-level dependency scanning flow to follow the following procedure:

Scan():
1. From the source target under scan, query all imported module identifiers for a *Swift* module. Leave unresolved identifiers unresolved. Proceed transitively to build a *Swift* module dependency graph.
2. Take every unresolved import identifier in the graph from (1) and, assuming that it must be a Clang module, dispatch all of them to be queried in-parallel by the scanner's worker pool.
3. Resolve bridging header Clang module dpendencies
4. Resolve all Swift overlay dependencies, relying on all Clang modules collected in (2) and (3)
5. For the source target under scan, use all of the above discovered module dependencies to resolve all cross-import overlay dependencies
@artemcm artemcm force-pushed the AsyncScanExperiment branch from f54273b to 0f50693 Compare October 30, 2024 18:10
@artemcm
Copy link
Contributor Author

artemcm commented Oct 30, 2024

@swift-ci test

1 similar comment
@artemcm
Copy link
Contributor Author

artemcm commented Oct 31, 2024

@swift-ci test

@artemcm
Copy link
Contributor Author

artemcm commented Nov 1, 2024

@swift-ci test macOS platform

@artemcm
Copy link
Contributor Author

artemcm commented Nov 1, 2024

@swift-ci test Windows platform

Copy link
Contributor Author

@artemcm artemcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -180,7 +180,7 @@ class ModuleDependencyInfoStorageBase {
ArrayRef<LinkLibrary> linkLibraries,
StringRef moduleCacheKey = "")
: dependencyKind(dependencyKind), linkLibraries(linkLibraries),
moduleCacheKey(moduleCacheKey.str()), resolved(false),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we kept adding things to directDependencies and marked them resolved when done.
Now, instead, we keep track of dependencies of various kinds separately, meaning each dependency kind is only ever set once and therefore we no longer need to track which dependencies are "resolved".

@@ -505,10 +512,14 @@ class ClangImporter final : public ClangModuleLoader {
/// about new Clang modules discovered along the way.
///
/// \returns \c true if an error occurred, \c false otherwise
bool addHeaderDependencies(
bool getHeaderDependencies(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got refactored to instead return a result that the client can use to update the dependency scanner cache, instead of doing it in-place.

@artemcm artemcm requested a review from nkcsgexi November 5, 2024 17:43
@artemcm artemcm merged commit 8e3a620 into swiftlang:main Nov 5, 2024
5 checks passed
@artemcm artemcm deleted the AsyncScanExperiment branch November 5, 2024 18:04
Copy link
Contributor

@cachemeifyoucan cachemeifyoucan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@artemcm You didn't add me as reviewer so I never got the review request.

LGTM in general. One small comment is that it seems you can async dispatch scanning as you collect the clang modules, rather than do it together in the end.

I also doubt if ModuleDependenciesCache is thread safe but I haven't had time to reason it thoroughly.

@artemcm
Copy link
Contributor Author

artemcm commented Nov 5, 2024

@artemcm You didn't add me as reviewer so I never got the review request.

Ah, sorry about missing the review request button, I'd assumed you were on the list.

LGTM in general. One small comment is that it seems you can async dispatch scanning as you collect the clang modules, rather than do it together in the end.

I agree, I think that would be a good direction to pursue. The only reason I have not yet is that this approach seemed a bit simpler to implement, and the Swift scanning portion is really quite fast so my intuition is that it will make not that great a difference.

I also doubt if ModuleDependenciesCache is thread safe but I haven't had time to reason it thoroughly.

Queries/Updates to the ModuleDependenciesCache are done on the main scanner thread, unless I missed something, which motivated some of the refactoring here as well.

auto moduleName = moduleIdentifier.str();
{
std::lock_guard<std::mutex> guard(CacheAccessLock);
if (cache.hasDependency(moduleName, ModuleDependencyKind::Clang))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have clang module lookup here in the threads. I guess it should be fine since this is technically a const value for lookup only but I am a bit paranoid about the mistake can be made. Maybe do the cache lookup before async and don't capture cache at all will make it less error prone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants