
metal: Cache compiled library at device level #12265


Merged: 1 commit, Mar 11, 2025


@BB-fat (Contributor) commented Mar 8, 2025

Currently, Metal shaders are recompiled on every llama context initialization. This is redundant and hurts performance when creating multiple contexts.
This PR caches the compiled Metal library at the device context level (g_ggml_ctx_dev_main) so that subsequent context initializations reuse it.

Fixes #12199
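
As a rough illustration of the idea described above, here is a minimal C sketch of caching a compiled library on a shared device context. It is not the code from this PR: the real backend is Objective-C and holds Metal objects, whereas `device_context`, `shader_library`, `compile_library`, `device_acquire_library`, and `device_release_library` are hypothetical names. Tying the cache lifetime to a reference count is also an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a compiled shader library. */
typedef struct {
    int id;
} shader_library;

/* Hypothetical device-level context shared by all backend contexts
 * (it plays the role that g_ggml_ctx_dev_main has in the description). */
typedef struct {
    int             ref_count; /* number of live users of the device */
    shader_library *library;   /* cached library, NULL until first use */
} device_context;

/* Counts how often the expensive compile step actually runs. */
static int g_compile_count = 0;

/* Placeholder for the expensive shader compilation. */
static shader_library *compile_library(void) {
    shader_library *lib = malloc(sizeof *lib);
    if (lib == NULL) {
        return NULL;
    }
    lib->id = ++g_compile_count;
    return lib;
}

/* Return the cached library, compiling it only on first use. */
static shader_library *device_acquire_library(device_context *dev) {
    assert(dev != NULL);
    if (dev->library == NULL) {
        dev->library = compile_library();
        if (dev->library == NULL) {
            return NULL; /* compilation failed; nothing was acquired */
        }
    }
    dev->ref_count++;
    return dev->library;
}

/* Drop one reference; free the cached library when the last user leaves. */
static void device_release_library(device_context *dev) {
    assert(dev != NULL);
    if (dev->ref_count == 0) {
        return; /* ignore a release that has no matching acquire */
    }
    if (--dev->ref_count == 0) {
        free(dev->library);
        dev->library = NULL;
    }
}
```

With this shape, two successive acquires on the same device run the compile step once and return the same pointer, and the library is freed only when the last user releases it.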

@github-actions bot added the ggml and Apple Metal labels on Mar 8, 2025
@BB-fat force-pushed the metal-library-cache branch 2 times, most recently from 6b3a511 to 0569909, on March 9, 2025 12:27
@BB-fat marked this pull request as ready for review on March 9, 2025 12:29
@BB-fat (Contributor, Author) commented Mar 10, 2025

During testing, I found an Objective-C double-release issue. I am working on a fix.

@BB-fat force-pushed the metal-library-cache branch from 0569909 to 70432c7 on March 10, 2025 05:33

@BB-fat (Contributor, Author) commented Mar 10, 2025

@ggerganov Please review when convenient.

@ggerganov merged commit 6ab2e47 into ggml-org:master on Mar 11, 2025
47 checks passed
@BB-fat deleted the metal-library-cache branch on March 12, 2025 02:19
ishaangandhi pushed a commit to ishaangandhi/llama.cpp that referenced this pull request Mar 12, 2025
jpohhhh pushed a commit to Telosnex/llama.cpp that referenced this pull request Mar 14, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Mar 19, 2025
Linked issue: [Metal] Context init optimization opportunity: metal library is compiled for every llama context