Bug fixed with n_ctx=0 #1015

Merged
merged 1 commit into abetlen:main on Dec 16, 2023

Conversation

DanieleMorotti (Contributor)

If the n_ctx parameter is set to 0, the function should use the maximum context length of the selected model, but it didn't work: there was a problem with the initialization of this parameter, and a related problem with n_batch.

I know the code is not the cleanest, but in order to get the model's context information I had to add it after the creation of the model instance, at line 923 of llama.py:

self._model = _LlamaModel(
    path_model=self.model_path, params=self.model_params, verbose=self.verbose
)

Unfortunately, several objects had already been initialized by that point, so in the fix I had to overwrite the n_ctx, self.n_batch, self.context_params.n_ctx, and self.context_params.n_batch variables even though they already had values.
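
A minimal sketch of that backfill, using the names from the snippet above and assuming _LlamaModel exposes the trained context length as n_ctx_train() (treat the exact names as illustrative, not the final diff):

self._model = _LlamaModel(
    path_model=self.model_path, params=self.model_params, verbose=self.verbose
)
# If the caller passed n_ctx=0, fall back to the model's trained context
# size and clamp the batch size to it, overwriting the values that were
# initialized before the model instance existed.
if n_ctx == 0:
    n_ctx = self._model.n_ctx_train()
    self.n_batch = min(n_ctx, n_batch)
    self.context_params.n_ctx = n_ctx
    self.context_params.n_batch = self.n_batch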

Tell me if you find a smarter or more elegant way to change the code and I will implement it.

This change should also fix #988.

Thank you

DanieleMorotti (Contributor, Author)

Another approach could be to use the code from the llama.cpp repo to read the metadata from the GGUF file before loading the model. In the JSON output, the context length is stored under the key llama.context_length.
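
A rough sketch of that idea, assuming the gguf-py package published from the llama.cpp repo is installed (pip install gguf); the exact reader API used here is an assumption and may differ between versions:

from gguf import GGUFReader

def read_context_length(model_path: str) -> int:
    # GGUF stores the trained context size under "<arch>.context_length",
    # e.g. "llama.context_length" for llama-architecture models.
    reader = GGUFReader(model_path)
    field = reader.fields["llama.context_length"]
    # Scalar fields expose their value through the parts/data indices.
    return int(field.parts[field.data[0]][0])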

abetlen (Owner) commented Dec 16, 2023

@DanieleMorotti does the metadata value differ from n_ctx_train? I think this approach is good though; maybe worth making it the default in a future major release.

abetlen merged commit f1c631d into abetlen:main on Dec 16, 2023
DanieleMorotti (Contributor, Author)

I think n_ctx_train is the same value that we can read from the metadata. Tell me if I should try to implement the other approach; we would need to add the gguf-py directory of llama.cpp to this repo and call the function that returns the JSON with the information.
Thanks
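
With the fix merged, passing n_ctx=0 falls back to the model's trained context length. A quick way to check (the model path here is a placeholder):

from llama_cpp import Llama

# n_ctx=0 asks the wrapper to use the model's trained context size.
llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf", n_ctx=0)
print(llm.n_ctx())  # reports the trained context length, e.g. 4096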

K-Mistele added a commit to K-Mistele/llama-cpp-python that referenced this pull request Jan 16, 2024
abetlen pushed a commit that referenced this pull request Jan 16, 2024
Successfully merging this pull request may close these issues.

Llama(n_ctx = 0) doesn't work (#988)