feat: conditionally toggle chat on invocations route #1454


Merged
merged 2 commits into main from invocations-toggle-chat-via-env on Jan 22, 2024

Conversation

drbh
Collaborator

@drbh drbh commented Jan 18, 2024

This PR adds support for reading the `OAI_ENABLED` env var, which changes the function called when the `/invocations` route is invoked.

If `OAI_ENABLED=true`, the `chat_completions` method is used; otherwise it defaults to `compat_generate`.
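Conceptually, the toggle boils down to a startup-time env check that picks one of two handlers. A minimal sketch (function names here mirror the PR's description but the helper itself is illustrative, not the actual router code):

```rust
use std::env;

/// Decide which handler /invocations should dispatch to, based on the
/// raw OAI_ENABLED value. Pure helper so the decision is easy to test.
fn invocations_handler(oai_enabled: Option<&str>) -> &'static str {
    if oai_enabled == Some("true") {
        "chat_completions"
    } else {
        // Unset or any value other than "true" keeps the old behavior.
        "compat_generate"
    }
}

fn main() {
    // Read the toggle once at startup, as the router would.
    let value = env::var("OAI_ENABLED").ok();
    println!("/invocations -> {}", invocations_handler(value.as_deref()));
}
```

Note that the default (env var unset or any non-`true` value) falls through to `compat_generate`, so existing deployments are unaffected.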

Example running the router:

```bash
OAI_ENABLED=true \
  cargo run -- \
  --tokenizer-name mistralai/Mistral-7B-Instruct-v0.2
```

Example request:

```bash
curl localhost:3000/invocations \
    -X POST \
    -d '{ "model": "tgi", "messages": [ { "role": "user", "content": "What is the IP address of the Google DNS servers?" } ], "stream": false, "max_tokens": 20, "logprobs": true, "seed": 0 }' \
    -H 'Content-Type: application/json' | jq
```

Please let me know if any naming changes are needed or if any other routes need similar functionality.

@drbh drbh requested a review from Narsil January 22, 2024 14:20
Collaborator

@Narsil Narsil left a comment


LGTM.

I just think we should document the behavior somewhere.

@philschmid any idea of where that will be best documented for SM users?

@philschmid
Contributor

> LGTM.
>
> I just think we should document the behavior somewhere.
>
> @philschmid any idea of where that will be best documented for SM users?

Good question. Should we add a dedicated documentation page in TGI on how to deploy to SageMaker, where we could then include this?

@Narsil
Collaborator

Narsil commented Jan 22, 2024

I'm fine with it @drbh. Could you create a new page in the doc? This can be done in a follow-up PR.

@philschmid
Contributor

> I'm fine with it @drbh. Could you create a new page in the doc? This can be done in a follow-up PR.

Happy to help here! It would mostly be based on this blog post:

@drbh drbh merged commit 98e5faf into main Jan 22, 2024
@drbh drbh deleted the invocations-toggle-chat-via-env branch January 22, 2024 15:29
kdamaszk pushed a commit to kdamaszk/tgi-gaudi that referenced this pull request Apr 29, 2024