Skip to content

Merge Openai api version route to main #1021

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 14, 2024
Merged

Merge Openai api version route to main #1021

merged 4 commits into from
Aug 14, 2024

Conversation

vmpuri
Copy link
Contributor

@vmpuri vmpuri commented Aug 8, 2024

#1016 mistakenly got merged into this development branch instead of main.

Copy link

pytorch-bot bot commented Aug 8, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1021

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures

As of commit 716485c with merge base f384d4f (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 8, 2024
@vmpuri vmpuri marked this pull request as ready for review August 8, 2024 02:22
@vmpuri vmpuri requested a review from Jack-Khuu August 8, 2024 18:17
README.md Outdated
@@ -200,6 +200,10 @@ streamlit run torchchat.py -- browser llama3.1
<details>
<summary>This mode gives a REST API that matches the OpenAI API spec for interacting with a model</summary>

The server follows the [OpenAI API specification](https://platform.openai.com/docs/api-reference/chat) for chat completions.
Since this feature is under active development, it's possible not every parameter is consumed. See api/api.py for details on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Since this feature is under active development, it's possible not every parameter is consumed. See api/api.py for details on
Since this feature is under active development, not every parameter is consumed. See api/api.py for details on

@@ -270,7 +271,13 @@ def chunked_completion(self, completion_request: CompletionRequest):
)
generator_args = GeneratorArgs(
completion_request.messages[-1].get("content"),
max_new_tokens=(
int(completion_request.max_tokens)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why cast?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The request object was being populated from JSON, which automatically means the value for every field is str despite the dataclass type hints.

These casts were to get things working, but yes - this is a bad solution and we should have better type enforcement here (and possibly repo-wide)

Based on what I can find, we can resolve this by casting the types at the object's init time for each field (kinda tedious) or by using something like pydantic (yet another package to manage).

Should we attack this in another issue/PR or try to resolve it here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting at init time makes sense, since downstream users shouldn't need to think about it

We can make it a separate PR or push the casting to init just for max_tokens and temperature

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #1025 to track this, will merge this PR for now.

encoded_prompt=encoded,
temperature=float(completion_request.temperature),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous re: typecasting for dataclasses.

@vmpuri vmpuri force-pushed the openai_api_version_route branch from 01bc092 to d1a0da8 Compare August 12, 2024 06:47
@byjlw byjlw merged commit 3ce1cef into main Aug 14, 2024
46 of 51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants