ci : add LoRA test to CI #2650
Conversation
It would be good to test LoRA with quantized models as well, both with and without an f16 …
I can easily increase it, but the runs will eventually start to take forever as we add more tests. I can deploy more nodes, so another solution is to split the tests into groups and have different nodes run different groups. To do that, we have to make … Here are the current env variables on the CUDA node, for example: … We can add …
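For illustration, such grouping could be driven by an environment variable set per node. A minimal sketch, where the `GG_TEST_GROUP` name and the group labels are hypothetical and not the actual ggml-ci configuration:

```bash
#!/bin/bash
# hypothetical: each CI node exports GG_TEST_GROUP to select which subset it runs
GG_TEST_GROUP=${GG_TEST_GROUP:-all}

run_ppl_tests()  { echo "running perplexity tests"; }   # placeholder
run_lora_tests() { echo "running LoRA tests"; }         # placeholder

case "$GG_TEST_GROUP" in
    ppl)  run_ppl_tests ;;
    lora) run_lora_tests ;;
    all)  run_ppl_tests; run_lora_tests ;;
    *)    echo "unknown test group: $GG_TEST_GROUP"; exit 1 ;;
esac
```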
I have added a test with q8_0 only for now; hopefully it is not too slow. This is CPU only, since the CUDA backend only supports LoRA with f16 models.
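A rough sketch of what this q8_0 step could look like with the existing `quantize` and `perplexity` tools; the paths and the thread/chunk counts below are illustrative, not the exact CI invocation:

```bash
# illustrative paths; the actual CI script may use different names/locations
MODEL_F16=models/open-llama-3b/ggml-model-f16.bin
MODEL_Q80=models/open-llama-3b/ggml-model-q8_0.bin
LORA=lora/shakespeare/ggml-adapter-model.bin

# quantize the f16 model to q8_0
./bin/quantize "$MODEL_F16" "$MODEL_Q80" q8_0

# apply the LoRA at load time and measure perplexity on the training text;
# CPU only, since the CUDA backend currently supports LoRA only for f16 models
./bin/perplexity -m "$MODEL_Q80" --lora "$LORA" -f shakespeare.txt -t 4 --chunks 2
```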
Looks like it didn't time out. This should be good enough for now; we can add the rest of the quantized models once we figure out the build groups. Some things to review: …
Let's update this PR after the #2398 merge and after updating the …
ggml-ci
use 1 thread for CUDA generation tests ggml-ci
ggml-ci
I've bumped the CI timeout to 30 minutes. For now, we can keep just the …
* ci : add lora test ggml-ci
* move lora summary to the top, add lora logs ggml-ci
* ci : decrease CPU ppl runs to 2 to avoid 20 min timeout ggml-ci
* add 7b lora test; use 1 thread for CUDA generation tests ggml-ci
* add test with q8_0 (cpu only) ggml-ci

Co-authored-by: Georgi Gerganov <[email protected]>
Downloads a LoRA trained on `shakespeare.txt` and compares the perplexity on this dataset with and without applying the LoRA. Only for 3B and f16 currently; if it looks ok, I can try training another LoRA for 7B, and possibly add tests for quantized models.
Fixes #2634
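For reference, a minimal sketch of that check, assuming illustrative paths for the model, the adapter, and the dataset (the actual CI script additions may differ):

```bash
#!/bin/bash
set -e

# illustrative paths; the real CI downloads these artifacts first
MODEL=models/open-llama-3b/ggml-model-f16.bin
LORA=lora/shakespeare/ggml-adapter-model.bin
DATA=shakespeare.txt

# perplexity on shakespeare.txt with the base model
./bin/perplexity -m "$MODEL" -f "$DATA" -t 4 --chunks 2 | tee ppl-base.log

# perplexity with the LoRA applied at load time
./bin/perplexity -m "$MODEL" --lora "$LORA" -f "$DATA" -t 4 --chunks 2 | tee ppl-lora.log

# since the LoRA was trained on this dataset, the second run is expected to
# report a noticeably lower perplexity than the first
```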