Disable `decoder_input_details` on OpenAI-compatible chat streaming, pass temp and top-k from API #1470
This PR makes some minor tweaks to the new OpenAI-compatible chat endpoint #1427:

- Disables `decoder_input_details` in `GenerateParameters` when streaming is enabled. This was causing all streaming chat requests to fail before, since `decoder_input_details == true` is not supported when streaming tokens.
- Passes the `temperature` and `top_p` hyperparameters from the API request to `GenerateParameters` (a rough sketch of this mapping is below).
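For illustration only, here is a minimal sketch of the intended mapping, not the actual router code; the struct fields are simplified and the names are assumptions based on this description:

```rust
// Sketch: how the chat handler could translate OpenAI-style request fields
// into GenerateParameters. Field sets are reduced for clarity.
struct ChatRequest {
    stream: bool,
    temperature: Option<f32>,
    top_p: Option<f32>,
}

struct GenerateParameters {
    decoder_input_details: bool,
    temperature: Option<f32>,
    top_p: Option<f32>,
}

fn to_generate_parameters(req: &ChatRequest) -> GenerateParameters {
    GenerateParameters {
        // Prefill details are rejected by validation when streaming,
        // so only request them for non-streaming calls.
        decoder_input_details: !req.stream,
        // Forward the sampling hyperparameters instead of dropping them.
        temperature: req.temperature,
        top_p: req.top_p,
    }
}

fn main() {
    let req = ChatRequest { stream: true, temperature: Some(0.7), top_p: Some(0.9) };
    let params = to_generate_parameters(&req);
    assert!(!params.decoder_input_details);
    println!("temperature = {:?}, top_p = {:?}", params.temperature, params.top_p);
}
```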
Testing
With this change, streaming chat requests should work correctly. The most recent release from `main` currently returns an error.

It's my first time contributing to this project, so I could be missing something. Would especially appreciate @drbh's eyes on this one.
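A possible reproduction sketch (not part of this PR) is below, assuming a local TGI server on `localhost:8080` and the `reqwest` (with `json` and `stream` features), `tokio` (with `macros` and `rt-multi-thread` features), `futures-util`, and `serde_json` crates; it just prints the raw streamed response so the current error, or the token stream with this PR applied, is visible:

```rust
use futures_util::StreamExt;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    // Hypothetical streaming chat request with sampling hyperparameters set.
    let body = json!({
        "model": "tgi",
        "stream": true,
        "temperature": 0.7,
        "top_p": 0.9,
        "messages": [{ "role": "user", "content": "Hello!" }]
    });

    let mut stream = reqwest::Client::new()
        .post("http://localhost:8080/v1/chat/completions")
        .json(&body)
        .send()
        .await?
        .bytes_stream();

    // Print the raw response: on the current release this shows the error,
    // while with this PR it should show streamed chat completion chunks.
    while let Some(chunk) = stream.next().await {
        print!("{}", String::from_utf8_lossy(&chunk?));
    }
    Ok(())
}
```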