ci : switch cudatoolkit install on windows to networked #3236
I think if you manage to get the install files cached it will be even faster (the network method downloads about 230M of additional files on top of the ~30M already cached).
@staviq they should all be pull requests. Remind me in a week to clear the caches :)
I was testing a PR for this but you were faster :) I don't think the cache problem can be solved via PR; that seems to be a GitHub thing.
:)
No, I meant that the old cudatoolkit caches are still used by PRs that have not pulled the workflow changes from master yet.
Ok, it works now. The actual CUDA installer alone takes ~13 min according to the CI logs. At this point, the only thing that can speed this up is a runner with preinstalled CUDA. I might be able to set one up (an actual HP server), but I'll know for sure tomorrow.
In the meantime, CUDA CI, and most CI jobs in fact, run cmake --build non-parallel. Adding a parallel-build flag speeds up CUDA CI build time by ~25% (on my runner), though GitHub runners are only dual core from what I've found, so it might not make a difference with GitHub-hosted runners. I have my runner server set up with Windows Server 2022, and I'm testing it, but it turns out that, even if I pre-install CUDA, [...]. I tried disabling the cuda-toolkit action to use only the preinstalled CUDA, and together with [...]. Except that the workflow uses env variables provided by the cuda-toolkit action, and the final packaging step fails. So that's my current progress.
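For illustration, a minimal sketch of what a parallel cmake --build invocation could look like. This is not the exact flag used in the comment above. The `build` directory name and the fallback of 2 jobs (the dual-core figure mentioned above) are assumptions; `--parallel` is the standard CMake option (3.12 and later), and `NUMBER_OF_PROCESSORS` is the environment variable Windows sets to the logical core count.

```shell
#!/bin/sh
# Sketch only: pick a job count, then assemble a parallel build command.
# Prefer the Windows-provided NUMBER_OF_PROCESSORS; otherwise ask getconf;
# fall back to 2 (assumed dual-core hosted runner) if neither is available.
jobs="${NUMBER_OF_PROCESSORS:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)}"

# "build" is a hypothetical build directory; adjust to the workflow's layout.
build_cmd="cmake --build build --parallel ${jobs}"

# Print the command instead of running it, so the sketch works without a project.
echo "$build_cmd"
```

On a dual-core runner this yields `--parallel 2`, so the gain there would likely be smaller than the ~25% seen on the self-hosted machine.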
As proposed in #3232, this switches to the networked installer, which should reduce the traffic. In my testing it reduces the time from 15-20 min down to ~10 min.
Even if it does not reduce the time by much, it reduces the cache size significantly.
run: https://github.com/Green-Sky/llama.cpp/actions/runs/6215910973/job/16869261559