Skip to content

Accelerate via @actions/cache #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 18, 2021
Merged

Accelerate via @actions/cache #8

merged 3 commits into from
Feb 18, 2021

Conversation

dscho
Copy link
Member

@dscho dscho commented Feb 18, 2021

This change makes this Action truly useful: the first time we try to get a given build artifact, it will be downloaded from Azure Pipelines. But then it will be cached. And subsequent attempts will use the cached version and be a lot faster!

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

@dennisameling I would also love to have your 👀 over this PR...

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

I am currently testing this by running a full matrix here: https://github.com/dscho/setup-git-for-windows-sdk/runs/1925713769?check_suite_focus=true. Once that run is done (and so far, it looks pretty good, having cached 4 out of the 6 artifacts), I will push a new commit to the test branch that triggers another run: this one should use the cached artifacts and demonstrate just how much faster things become this way.

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

Wow, caching is not exactly cheap, either: the full 32-bit SDK takes ~2m45s to cache 914MB... in contrast, it only takes ~15s to cache the ~75MB of git-sdk-64-minimal (hit the gear on the upper right to "Show timestamps").

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

Yes!!! The subsequent run manages to restore the minimal SDK in 10s instead of the 1m4s in the uncached case.

For the git-artifacts.yml workflow, the most interesting comparison is the build-installers one: 1m vs ~5m30s.

In contrast, the full SDKs seem not to benefit all that much. A run from yesterday (without caching the just-downloaded artifacts) clocked the 32-bit and 64-bit full SDK at ~6m and ~8m20s, respectively. When restoring them from cache, the numbers are ~5m and ~8m15s, respectively.

Given these findings, I think we will want to disable the cache for the full SDKs by default because it does not really seem to give us a big bang for the buck.

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

This PR is now based on #9, which should be merged first.

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

And here is a run that exercises the full supported flavor/architecture matrix with the default caching.

I think this is good to go now.

@dennisameling
Copy link
Contributor

This looks great, but I'm not sure about one thing...

When restoring them from cache, the numbers are ~5m and ~8m15s, respectively.

For x86_64, the cache size looks rather large to me: Cache Size: ~1300 MB (1363005456 B), compared to the full download from Azure Pipelines (~500M):

100  497M    0  497M    0     0  14.1M      0 --:--:--  0:00:35 --:--:-- 15.8M
unzipping C:/git-sdk-64-full-sdk/tmpz0AltS/artifacts.zip

Are you sure that's correct?

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

When restoring them from cache, the numbers are ~5m and ~8m15s, respectively.

For x86_64, the cache size looks rather large to me: Cache Size: ~1300 MB (1363005456 B), compared to the full download from Azure Pipelines (~500M):

100  497M    0  497M    0     0  14.1M      0 --:--:--  0:00:35 --:--:-- 15.8M
unzipping C:/git-sdk-64-full-sdk/tmpz0AltS/artifacts.zip

Are you sure that's correct?

Yes, that's correct. The full SDK is actually not just a .zip file, it is a .tar.xz file within a .zip file. That's a deviation from all the other flavors, specifically so that the bandwidth is used more efficiently (I tried to strike a balance between download and extraction time).

This opt-out feature uses the cache, so that the files are cached the
first time a given artifact from a given buildId is downloaded.
Subsequent attempts to download the same artifact with the same buildId
should become much quicker as a consequence.

Signed-off-by: Johannes Schindelin <[email protected]>
Some testing reveals that the 64-bit and the 32-bit `full` SDK weigh in
with ~1,300MB and ~915MB at time of writing, respectively. Packing and
saving these to the cache takes a long time: ~5m20s and ~2m45s.

Given that the non-cached download takes ~8m20s and ~6m, respectively,
and restoring the artifacts from cache takes ~8m15s and ~5m (which is
not a lot faster, not by a long stretch), it is neither worth the time
nor the disk space to cache the `full` SDKs.

Let's introduce a new default value for the input parameter `cache`,
`auto`, that turns on caching except for the `full` flavor.

Signed-off-by: Johannes Schindelin <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

Rebased (and regenerated dist/), to avoid the merge conflicts in dist/index.js.

@dscho
Copy link
Member Author

dscho commented Feb 18, 2021

This looks great, but I'm not sure about one thing...

With the question why 500MB are equivalent to 1,300MB out of the way, I interpret this as your 👍 to merging this ;-)

@dscho dscho merged commit 0286ae1 into git-for-windows:main Feb 18, 2021
@dscho dscho deleted the cache branch February 18, 2021 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants