Fix index in ChatCompletionChunk #1648

Wauplin · 2024-03-15T13:21:54Z

Fix a small inconsistency with OpenAI's chat-completion behavior (introduced in #1427, cc @drbh). When using stream=True, each chunk has an index value in ChatCompletionChoice. This index is not meant to be the index of the generated token but the index of the choice, which is always 0 (since TGI always returns a single choice).

See https://platform.openai.com/docs/api-reference/chat/object:

index integer
The index of the choice in the list of choices.

So instead of

data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":1,"delta":{"role":"assistant","content":"I"},"logprobs":null,"finish_reason":null}]}
data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":2,"delta":{"role":"assistant","content":"'"},"logprobs":null,"finish_reason":null}]}
data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":3,"delta":{"role":"assistant","content":"m"},"logprobs":null,"finish_reason":"length"}]}

it should return

data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":0,"delta":{"role":"assistant","content":"I"},"logprobs":null,"finish_reason":null}]}
data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":0,"delta":{"role":"assistant","content":"'"},"logprobs":null,"finish_reason":null}]}
data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":0,"delta":{"role":"assistant","content":"m"},"logprobs":null,"finish_reason":"length"}]}
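
To illustrate why this matters on the client side, here is a minimal sketch of a consumer that groups streamed deltas by choice index. The helper names `accumulate_by_choice` and `make_line` and the shortened sample payloads are assumptions for illustration only; they are not part of TGI or of any OpenAI client.

```python
import json
from collections import defaultdict


def accumulate_by_choice(sse_lines):
    """Concatenate streamed delta content per choice index (hypothetical helper)."""
    texts = defaultdict(str)
    for line in sse_lines:
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue
        chunk = json.loads(payload)
        for choice in chunk["choices"]:
            # A delta may be missing or carry no content; treat both as empty.
            content = (choice.get("delta") or {}).get("content") or ""
            texts[choice["index"]] += content
    return dict(texts)


def make_line(index, content):
    """Build a shortened SSE line shaped like the chunks above (illustrative only)."""
    choice = {
        "index": index,
        "delta": {"role": "assistant", "content": content},
        "finish_reason": None,
    }
    return "data:" + json.dumps({"choices": [choice]})


before_fix = [make_line(1, "I"), make_line(2, "'"), make_line(3, "m")]
after_fix = [make_line(0, "I"), make_line(0, "'"), make_line(0, "m")]

print(accumulate_by_choice(before_fix))  # {1: 'I', 2: "'", 3: 'm'}
print(accumulate_by_choice(after_fix))   # {0: "I'm"}
```

With the token index, a client that follows the documented meaning of `index` ends up with three one-character "choices"; with `index` fixed at 0, it rebuilds the single completion.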

EDIT: I also changed ToolCall.index to always be 0 (instead of the generated token index), but I'm actually unsure about this one. It might be the index of the tool in the array of tools? OpenAI's documentation doesn't provide any information about it:

index integer

I also noticed that in OpenAI's example, the last chunk doesn't have a delta and is the only one that returns a finish_reason. TGI is slightly different since the last chunk has both the last delta (i.e. the last generated token) and the finish reason. I don't think this is worth fixing since it is not a requirement according to the docs/specs (at least not that I know of).
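
For what it's worth, a client can be written so that it does not care which of the two shapes the last chunk takes. The sketch below is an assumption-laden illustration: `consume` is a hypothetical helper, the chunks are already-parsed dicts trimmed to the relevant fields, and the "OpenAI-style" final chunk is modelled with an empty delta.

```python
def consume(chunks):
    """Return (text, finish_reason) for a single-choice stream (hypothetical helper)."""
    text = ""
    finish_reason = None
    for chunk in chunks:
        choice = chunk["choices"][0]
        # The delta may be absent or empty on the final chunk.
        delta = choice.get("delta") or {}
        text += delta.get("content") or ""
        if choice.get("finish_reason") is not None:
            finish_reason = choice["finish_reason"]
    return text, finish_reason


def chunk(content, finish_reason=None):
    """Build a trimmed chunk; content=None models an empty delta (illustrative only)."""
    delta = {} if content is None else {"content": content}
    return {"choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}]}


# TGI-style: the last chunk carries both the last token and the finish reason.
tgi_style = [chunk("I"), chunk("'"), chunk("m", "length")]
# OpenAI-style (as assumed here): a final chunk with an empty delta and the finish reason.
openai_style = [chunk("I"), chunk("'"), chunk("m"), chunk(None, "length")]

print(consume(tgi_style))     # ("I'm", 'length')
print(consume(openai_style))  # ("I'm", 'length')
```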

drbh

@Wauplin thanks for the fix and references 🙏

LGTM


Fix index in chat completion chunk

9a12e14

Wauplin requested a review from drbh March 15, 2024 13:25

fix tool call as well

3686f9c

drbh approved these changes Mar 16, 2024

drbh merged commit 23fba67 into main Mar 16, 2024

drbh deleted the fix-index-in-chat-completion-chunks branch March 16, 2024 16:14
