Disable KV cache shifting automatically for unsupported models #11053

Merged: 2 commits, Jan 3, 2025

MollySophia (Collaborator)

Disable KV cache shifting automatically for unsupported models instead of exiting directly.

This makes things easier for models that don't support KV cache shifting.
Currently, in arg.cpp, --no-context-shift is enabled only for LLAMA_EXAMPLE_MAIN, LLAMA_EXAMPLE_SERVER, LLAMA_EXAMPLE_IMATRIX, and LLAMA_EXAMPLE_PERPLEXITY. As a result, running llama-parallel with a recurrent model, for example, fails with a message saying that context shift is not supported, yet --no-context-shift isn't an available parameter for llama-parallel.
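The behavior described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual diff: `sketch_params`, `sketch_model`, and `resolve_ctx_shift` are hypothetical stand-ins rather than llama.cpp types or functions, and the warning text is an assumption. The idea is that when context shifting is requested but the model cannot shift its KV cache, the setting is turned off with a warning instead of the program exiting.

```cpp
#include <cstdio>

// Hypothetical stand-in for the common parameters; not the real llama.cpp struct.
struct sketch_params {
    bool ctx_shift = true; // context shifting requested (assumed default for this sketch)
};

// Hypothetical stand-in for a loaded model/context.
struct sketch_model {
    bool can_shift; // whether the model's KV cache supports shifting
};

// Disables context shifting when the model does not support it.
// Returns true if the setting had to be changed, false otherwise.
bool resolve_ctx_shift(sketch_params & params, const sketch_model & model) {
    if (params.ctx_shift && !model.can_shift) {
        // Warn and continue instead of exiting with an error.
        std::fprintf(stderr,
            "warning: KV cache shifting is not supported for this model, disabling it\n");
        params.ctx_shift = false;
        return true;
    }
    return false;
}
```

With a check like this in shared initialization code, a tool that does not expose --no-context-shift (such as llama-parallel in the example above) would still be able to start with a recurrent model.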

Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
ggerganov merged commit 4b0c638 into ggml-org:master on Jan 3, 2025
47 checks passed
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
…ls (ggml-org#11053)

* Disable KV cache shifting automatically for unsupported models

instead of exiting directly

Signed-off-by: Molly Sophia <[email protected]>

* Update common/common.cpp

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025