-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Retrieve resolved package versions in parallel #8203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retrieve resolved package versions in parallel #8203
Conversation
5ce5aec
to
3be985a
Compare
3be985a
to
1d71bc0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very nice improvement, thanks!
@swift-ci test |
As always, I worry about missing dependencies when making an algorithm run in parallel. Are we sure nothing will try and fetch the results before they are ready? |
Ah, I see we await until they're all done before we continue. Cool. |
I have to revert this. The checkoutRepository calls to save the workspace but also updates the state and there's no locking around the workspace state data causing a crash. Until we ensure only one thread is changing the workspace state at a time, we can't do this in parallel. |
This reverts commit 0340bb1.
Reverts #8203 This is causing crashes in SwiftPM during the parallelized checkout. The checkouts update the workspace state and then save it. The workspace save data is not protected against multi-thread access while the save is walking the data which is causing the crash.
Retrieve resolved package versions in parallel to improve package resolve performance ### Motivation: When running `swift package resolve`, resolved package versions are retrieved sequentially. This is not a big problem when using source control resolution as in that case, the deep clones are [done in parallel](https://github.com/swiftlang/swift-package-manager/blob/0401a2ae55077cfd1f4c0acd43ae0a1a56ab21ef/Sources/Workspace/Workspace%2BDependencies.swift#L367) when SwiftPM tries to resolve the required versions. However, when resolving packages using registry, the download of the source archive happens after the required version is resolved. That happens sequentially: https://github.com/swiftlang/swift-package-manager/blob/0401a2ae55077cfd1f4c0acd43ae0a1a56ab21ef/Sources/Workspace/Workspace%2BDependencies.swift#L422 ### Modifications: Similarly to [resolving package containers](https://github.com/swiftlang/swift-package-manager/blob/0401a2ae55077cfd1f4c0acd43ae0a1a56ab21ef/Sources/Workspace/Workspace%2BDependencies.swift#L367), I updated the retrieval of resolved package versions to be executed in parallel as well. ### Result: Below, you can find the found performance improvements for registry-driven resolution based on my (admittedly limited) performance measurements. For testing, I used the following `Package.swift`: ```swift // swift-tools-version: 6.0 import PackageDescription let package = Package( name: "Library", dependencies: [ .package(url: "https://github.com/pointfreeco/swift-composable-architecture", exact: "1.15.1"), .package(url: "https://github.com/apple/swift-syntax.git", exact: "509.0.1"), .package(url: "https://github.com/apple/swift-algorithms.git", exact: "1.2.0"), .package(url: "https://github.com/apple/swift-numerics.git", exact: "1.0.2"), .package(url: "https://github.com/apple/swift-async-algorithms.git", exact: "1.0.1"), ] ) ``` #### Without changes from this PR ``` hyperfine --prepare 'rm -rf ~/.swiftpm .build && swift package purge-cache' --runs 5 --warmup 1 'swift package --replace-scm-with-registry resolve' Benchmark 1: swift package --replace-scm-with-registry resolve Time (mean ± σ): 27.671 s ± 4.137 s [User: 8.897 s, System: 3.273 s] Range (min … max): 24.793 s … 34.946 s 5 runs ``` #### With changes from this PR ``` hyperfine --prepare 'rm -rf ~/.swiftpm .build && swift package purge-cache' --runs 5 --warmup 1 '/Users/marekfort/Developer/swift-package-manager/.build/release/swift-package --replace-scm-with-registry resolve' Benchmark 1: /Users/marekfort/Developer/swift-package-manager/.build/release/swift-package --replace-scm-with-registry resolve Time (mean ± σ): 11.927 s ± 0.223 s [User: 8.832 s, System: 3.411 s] Range (min … max): 11.705 s … 12.232 s 5 runs ``` The improvements of the mean resolve time are over 60 %! If you suspect I tested anything wrong, please, let me know. I also tested source control-based resolution, there the difference is less start as the deep clones are already happening in parallel on `main`. Given the small differences, they very well might be based on inconsistent network performance. #### Without changes from this PR ``` hyperfine --prepare 'rm -rf ~/.swiftpm .build && swift package purge-cache' --runs 5 --warmup 1 'swift package resolve' Benchmark 1: swift package resolve Time (mean ± σ): 19.957 s ± 2.606 s [User: 21.070 s, System: 5.866 s] Range (min … max): 17.617 s … 24.409 s 5 runs ``` #### With changes from this PR ``` hyperfine --prepare 'rm -rf ~/.swiftpm .build && swift package purge-cache' --runs 5 --warmup 1 '/Users/marekfort/Developer/swift-package-manager/.build/release/swift-package resolve' Benchmark 1: /Users/marekfort/Developer/swift-package-manager/.build/release/swift-package resolve Time (mean ± σ): 17.914 s ± 0.867 s [User: 20.967 s, System: 6.409 s] Range (min … max): 17.093 s … 19.092 s 5 runs ``` The mean time is better by over 10 %, but as mentioned, the changes might be based on inconsistent network conditions. Note also that with the changes from this PR, the registry resolution is significantly faster than when using source control. This is what one would expect as registry resolution is downloading a whole lot less data. One side-effect of this PR is that since we fetch packages in parallel, the output itself is not sequential as well. See the below example output. #### Without changes from this PR ``` swift package resolve --replace-scm-with-registry Fetching https://github.com/swiftlang/swift-syntax Fetched https://github.com/swiftlang/swift-syntax from cache (2.89s) Fetching pointfreeco.swift-composable-architecture Fetched pointfreeco.swift-composable-architecture from cache (2.21s) Fetching pointfreeco.swift-concurrency-extras Fetched pointfreeco.swift-concurrency-extras from cache (0.43s) Fetching pointfreeco.swift-navigation Fetched pointfreeco.swift-navigation from cache (0.49s) Fetching apple.swift-async-algorithms Fetched apple.swift-async-algorithms from cache (0.55s) Fetching pointfreeco.xctest-dynamic-overlay Fetched pointfreeco.xctest-dynamic-overlay from cache (0.46s) ... ``` #### With changes from this PR ``` Fetching https://github.com/swiftlang/swift-syntax Fetched https://github.com/swiftlang/swift-syntax from cache (2.89s) Fetching apple.swift-numerics Fetching pointfreeco.swift-identified-collections Fetching pointfreeco.swift-navigation Fetching pointfreeco.xctest-dynamic-overlay Fetching pointfreeco.swift-case-paths Fetching pointfreeco.swift-perception .... Fetching pointfreeco.combine-schedulers Fetched pointfreeco.swift-perception from cache (0.76s) Fetched pointfreeco.swift-navigation from cache (0.88s) Fetched apple.swift-async-algorithms from cache (0.89s) Fetched pointfreeco.swift-dependencies from cache (1.10s) Fetched pointfreeco.swift-case-paths from cache (1.11s) Fetched pointfreeco.combine-schedulers from cache (1.38s) ... ``` I personally find the output ok, but let me know what you think about it.
Reverts swiftlang#8203 This is causing crashes in SwiftPM during the parallelized checkout. The checkouts update the workspace state and then save it. The workspace save data is not protected against multi-thread access while the save is walking the data which is causing the crash.
Retrieve resolved package versions in parallel to improve package resolve performance
Motivation:
When running
swift package resolve
, resolved package versions are retrieved sequentially. This is not a big problem when using source control resolution as in that case, the deep clones are done in parallel when SwiftPM tries to resolve the required versions.However, when resolving packages using registry, the download of the source archive happens after the required version is resolved. That happens sequentially:
swift-package-manager/Sources/Workspace/Workspace+Dependencies.swift
Line 422 in 0401a2a
Modifications:
Similarly to resolving package containers, I updated the retrieval of resolved package versions to be executed in parallel as well.
Result:
Below, you can find the found performance improvements for registry-driven resolution based on my (admittedly limited) performance measurements.
For testing, I used the following
Package.swift
:Without changes from this PR
With changes from this PR
The improvements of the mean resolve time are over 60 %! If you suspect I tested anything wrong, please, let me know.
I also tested source control-based resolution, there the difference is less start as the deep clones are already happening in parallel on
main
. Given the small differences, they very well might be based on inconsistent network performance.Without changes from this PR
With changes from this PR
The mean time is better by over 10 %, but as mentioned, the changes might be based on inconsistent network conditions. Note also that with the changes from this PR, the registry resolution is significantly faster than when using source control. This is what one would expect as registry resolution is downloading a whole lot less data.
One side-effect of this PR is that since we fetch packages in parallel, the output itself is not sequential as well. See the below example output.
Without changes from this PR
With changes from this PR
I personally find the output ok, but let me know what you think about it.