utils/update-checkout: Rework for more parallelism in updates #7325
Reworks the update_all_repositories() and update_single_repository()
functions to benefit from more per-repo parallelism.
to update_all_repositories().
are removed.
outer try clause to catch more errors.
appropriately.
to keep update_single_repository()'s length under control.
removed, so repos which lack the tag are now properly rebased when updated.
In PR #7181 (which was eventually rejected in favor of a different patch), @erg suggested that I look into further parallelizing update-checkout's --tag option, since it still did some work in serial form. When I looked further, I saw the same was true for --scheme and cross-repo-PR processing, and @erg encouraged me to parallelize everything that made sense. I decided to rework repository updating, keeping all the common work in update_all_repositories() and performing all the per-repo work in update_single_repository() or subfunctions called from it. This allowed for some simplification and reduction of repeated work. I also took the opportunity to remove the setting of the cross_repo flag from --tag processing, since @erg had said it was only a workaround for rebasing bugs, and PR #7232 (now merged) is a better solution.
The number of arguments passed to update_single_repository() has increased from five to eight; this was necessary to get all the needed info into the reworked function without repeating work or resorting to making some variables (like config) global.
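As a rough illustration of the split described above (not the actual patch), the sketch below does the common work once in update_all_repositories() and hands each repo's work to update_single_repository() through a worker pool. The argument tuple, the config layout, the "master" fallback, and the use of a thread pool are all assumptions made for this example; the real functions take more arguments and run git commands.

```python
from concurrent.futures import ThreadPoolExecutor


def update_single_repository(args):
    """Hypothetical per-repo worker: everything it needs arrives in one tuple,
    so nothing has to be global and no shared work is repeated."""
    repo_name, config, scheme, tag = args
    # Assumed config layout: {scheme_name: {repo_name: branch}}.
    branch = config.get(scheme, {}).get(repo_name, "master")
    # A tag, when given, takes precedence over the scheme's branch.
    target = tag if tag else branch
    # A real worker would fetch, check out, and rebase here.
    return repo_name, target


def update_all_repositories(config, repo_names, scheme=None, tag=None,
                            n_workers=4):
    """Do the common setup once, then fan the per-repo work out to a pool."""
    work_items = [(name, config, scheme, tag) for name in repo_names]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves input order, one result per repo.
        return dict(pool.map(update_single_repository, work_items))


if __name__ == "__main__":
    example_config = {"main": {"swift": "main", "llvm": "stable"}}
    print(update_all_repositories(example_config,
                                  ["swift", "llvm", "cmark"],
                                  scheme="main"))
```

Packing the arguments into a single tuple per repo is what lets the pool's map call drive the worker, and it is one reason the worker's argument count grows as more per-repo decisions move into it.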
This is obviously a significant patch. I've been using it on my own system for about a day to perform repo updates from both the main repo and my private fork, and to exercise options such as --tag, --scheme, --reset-to-remote, --clean, and cross-repo checkout with --github-comment. I haven't seen any problems.
Along with @erg, I'd also like @gottesmm and @shahmishal to critique this, since it looks like update-checkout was largely made by you two, and since @shahmishal seems to be in charge of the CI testing infrastructure which uses it.