
fix(cli): allow passing n_ctx=0 to openAI API server CLI arguments #1093


Merged
merged 1 commit into from
Jan 16, 2024

Conversation

K-Mistele
Contributor

What?

This PR updates the OpenAI API server's command line arguments to allow passing --n_ctx 0 as a command-line argument. Currently, the --n_ctx parameter has a minimum of 1; in llama_cpp/server/settings.py the context CLI parameter is configured as: n_ctx: int = Field(default=2048, ge=1, description="The context size.")
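A minimal sketch of the one-line change, assuming pydantic's `Field` constraint as quoted above (the surrounding settings class here is illustrative, not the actual class from `llama_cpp/server/settings.py`):

```python
from pydantic import BaseModel, Field


class ModelSettings(BaseModel):  # illustrative container, not the actual class
    # Before: ge=1 rejected n_ctx=0 at CLI parsing time.
    # After: ge=0 permits n_ctx=0, which signals "infer the context size
    # from the model's KV metadata" (see #1015).
    n_ctx: int = Field(default=2048, ge=0, description="The context size.")
```

With `ge=0`, `--n_ctx 0` validates successfully, while negative values are still rejected.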

Why?

As of [email protected], per #1015 by @DanieleMorotti, passing n_ctx=0 to the Llama class in llama_cpp/llama.py automatically sets n_ctx to the model's n_ctx_train parameter from KV, and also updates the model's n_batch to min(n_ctx, n_batch). This is intentional, to allow llama-cpp-python to infer the context size from the GGUF model file's KV parameters.

However, when this change was made, the OpenAI API server's CLI argument configuration was not updated, so the minimum value for the option remained at 1 - making the patch in #1015 unavailable to users of the OpenAI API server.
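The behavior described above (from #1015) can be sketched as a small helper; this is an illustrative reimplementation of the described logic, not the actual llama_cpp source:

```python
def resolve_context_params(n_ctx: int, n_batch: int, n_ctx_train: int) -> tuple[int, int]:
    """Mimic the n_ctx=0 handling described in #1015 (sketch only).

    n_ctx_train stands in for the trained context length read from the
    GGUF model file's KV metadata.
    """
    if n_ctx == 0:
        # Infer the context size from the model's trained context length.
        n_ctx = n_ctx_train
    # Keep the batch size no larger than the effective context size.
    n_batch = min(n_ctx, n_batch)
    return n_ctx, n_batch
```

For example, with n_ctx=0, n_batch=512, and a model trained with a 4096-token context, this resolves to a context of 4096 and a batch of 512.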

@abetlen
Owner

abetlen commented Jan 16, 2024

@K-Mistele thank you!

@abetlen abetlen merged commit 9c36688 into abetlen:main Jan 16, 2024