Bug fixed with n_ctx=0 #1015

Merged
merged 1 commit into abetlen:main on Dec 16, 2023

Conversation

DanieleMorotti (Contributor)

If the n_ctx parameter is set to 0, the function should use the maximum context length of the selected model, but it didn't work: there was a problem with the initialization of this parameter, and a related problem with n_batch.

I know the code is not the cleanest, but in order to get the model's context information I had to add it after the creation of the model instance, at line 923 of llama.py:

self._model = _LlamaModel(
    path_model=self.model_path, params=self.model_params, verbose=self.verbose
)

Unfortunately, several objects had already been initialized by that point, so in the fix I had to overwrite the n_ctx, self.n_batch, self.context_params.n_ctx, and self.context_params.n_batch variables even though they already had values.
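
A minimal sketch of that backfill, using the names from the snippet above and assuming _LlamaModel exposes the trained context length as n_ctx_train() (treat the exact names as illustrative, not the final diff):

self._model = _LlamaModel(
    path_model=self.model_path, params=self.model_params, verbose=self.verbose
)
# If the caller passed n_ctx=0, fall back to the model's trained context
# size and clamp the batch size to it, overwriting the values that were
# initialized before the model instance existed.
if n_ctx == 0:
    n_ctx = self._model.n_ctx_train()
    self.n_batch = min(n_ctx, n_batch)
    self.context_params.n_ctx = n_ctx
    self.context_params.n_batch = self.n_batch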

Tell me if you find a smarter or more elegant way to change the code and I will implement it.

This change should also fix #988.

Thank you

DanieleMorotti (Contributor, Author)

Another approach could be to use the code from the llama.cpp repo to read the metadata from the GGUF file before loading the model. In the JSON output, the context length is stored under the key llama.context_length.
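
A rough sketch of that idea, assuming the gguf-py package published from the llama.cpp repo is installed (pip install gguf); the exact reader API used here is an assumption and may differ between versions:

from gguf import GGUFReader

def read_context_length(model_path: str) -> int:
    # GGUF stores the trained context size under "<arch>.context_length",
    # e.g. "llama.context_length" for llama-architecture models.
    reader = GGUFReader(model_path)
    field = reader.fields["llama.context_length"]
    # Scalar fields expose their value through the parts/data indices.
    return int(field.parts[field.data[0]][0])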

abetlen (Owner) commented Dec 16, 2023

@DanieleMorotti does the metadata value differ from n_ctx_train? I think this approach is good though; maybe worth making it the default in a future major release.

abetlen merged commit f1c631d into abetlen:main on Dec 16, 2023
DanieleMorotti (Contributor, Author)

I think n_ctx_train is the same value that we can read from the metadata. Tell me if I should try to implement the other approach; we would need to add the gguf-py directory of llama.cpp to this repo and call the function that returns the JSON with the information.
Thanks
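
With the fix merged, passing n_ctx=0 falls back to the model's trained context length. A quick way to check (the model path here is a placeholder):

from llama_cpp import Llama

# n_ctx=0 asks the wrapper to use the model's trained context size.
llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf", n_ctx=0)
print(llm.n_ctx())  # reports the trained context length, e.g. 4096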

K-Mistele added a commit to K-Mistele/llama-cpp-python that referenced this pull request Jan 16, 2024
abetlen pushed a commit that referenced this pull request Jan 16, 2024
Successfully merging this pull request may close these issues.

Llama(n_ctx = 0) doesn't work (#988)