Fix index in ChatCompletionChunk #1648
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix a small inconsistency compared the OpenAI's chat-completion behavior (introduced in #1427 cc @drbh). When using
stream=True
, each chunk has anindex
value inChatCompletionChoice
. This index is not meant to be the index of the generated token but the index of the choice, which is always 0 (since TGI always return a single choice).See https://platform.openai.com/docs/api-reference/chat/object:
So instead of
if should return
EDIT: I also edited ToolCall.index to be always
0
(instead of the generated token index) but for this one I'm actually unsure. It might be the index of the tool in the array of tools? OpenAI's documentation doesn't provide any information about it:I also noticed that in OpenAI's example, the last chunk doesn't have a delta and is the only one that has a
finish_reason
returning. TGI is slightly different since the last chunk has both the last delta (i.e. the last generated token) + the finish reason. I don't think this is worth fixing since it is not a requirement according to the docs/specs (at least not that I know of).