Skip to content

feat: concat the adapter id to the model id in chat response #2779

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 25, 2024

Conversation

drbh
Copy link
Collaborator

@drbh drbh commented Nov 25, 2024

This PR concatenates the adapter id to the model id when an adapter id is specified in the chat request.

the follwing request

curl http: //localhost:3000/v1/chat/completions \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "google-cloud-partnership/gemma-2-2b-it-lora-jap-en",
    "messages": [
        {
            "content": "たくさんお金を稼ぎましたか? Translate: ",
            "role": "user"
        }
    ],
    "max_tokens": 10,
    "stream": false
}' | jq

currently returns

{
    "object": "chat.completion",
    "id": "",
    "created": 1732545871,
    "model": "google/gemma-2-2b-it",
    "system_fingerprint": "2.4.2-dev0-native",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Have you made a fortune?"
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 7,
        "total_tokens": 26
    }
}

and with the changes returns an updated model id

{
    "object": "chat.completion",
    "id": "",
    "created": 1732545871,
    "model": "google-cloud-partnership/gemma-2-2b-it-lora-jap-en",
    "system_fingerprint": "2.4.2-dev0-native",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Have you made a fortune?"
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 7,
        "total_tokens": 26
    }
}

OlivierDehaene
OlivierDehaene previously approved these changes Nov 25, 2024
// extract model id from request if specified
let model_id = match model.as_deref() {
Some("tgi") | None => info.model_id.clone(),
Some(m_id) => format!("{}+{}", info.model_id, m_id),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it should only be google-cloud-partnership/gemma-2-2b-it-lora-jap-en

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philschmid that case is much more concise, however it leaves out the current model id which could obscure information from the caller. Since the generation is a combination of the model and the adapter it seems to reason to add both ids.

I'm happy to change the id to just the adapter if that makes the most sense, however for the reasons above I think its best to include both

Copy link
Contributor

@philschmid philschmid Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But its not what openai or vllm does. They return the adapter or the "real" model. google-cloud-partnership/gemma-2-2b-it-lora-jap-en defines the base model in the config.json so i would only go with the adapter id

I would try to stay in sync with what others are doing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah thats a great reason to only use the adapter id, just updated in the latest commit

@drbh
Copy link
Collaborator Author

drbh commented Nov 25, 2024

merging as the failing CI is unrelated - and issues have been resolved/approved before the string change

@drbh drbh merged commit c637d68 into main Nov 25, 2024
10 of 12 checks passed
@drbh drbh deleted the include-adapter-id-with-model-id-in-repsonse branch November 25, 2024 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants