-
Notifications
You must be signed in to change notification settings - Fork 250
OpenAI API and UI using streamlit #875
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/875
Note: Links to docs will display an error until the docs builds have been completed. ❌ 20 New Failures, 3 Cancelled Jobs, 4 Unrelated FailuresAs of commit ed1c939 with merge base e1fb003 ( NEW FAILURES - The following jobs have failed:
CANCELLED JOBS - The following jobs were cancelled. Please retry:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
For now, this is pretty hacky. To run the Streamlit app:
Current Issues:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before getting into this review, please split this into smaller PRs. There are a lot of different things going on here, and if we needed to revert this it would have a large blast radius. Independent topics I see:
- Setting up ISSUE_TEMPLATE
- Fixing formatting and capitalization (whitespace, ExecuTorch, etc)
- Adding the API layer
- Adding the browser layer on top of the API layer
- The changes under distributed/
- Maybe others
I usually use ghstack to make a stack of PRs; basically, separate/squash things into one git commit per PR, then run ghstack to push/update the whole stack.
Though some of these are totally independent, and don't need to be stacked: they'd be better off as non-stacked PRs, because then we could land them on their own time.
Generate is broken. |
README.md
Outdated
@@ -118,6 +118,34 @@ python3 torchchat.py generate llama3 --prompt "write me a story about a boy and | |||
|
|||
For more information run `python3 torchchat.py generate --help` | |||
|
|||
The `Generator` class can also be imported into a Python program to generate responses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a bit overkill for the readme. We want to keep the readme as clean and short as possible. I'd return the necessary info for generate via --help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shortened this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything else in this readme is about the commandline. Torchchat isn't meant to be a library that other people can import and use, so we shouldn't talk about the internals of files in this README.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this section from the readme
5d3d896
to
c6bd796
Compare
README.md
Outdated
@@ -118,6 +118,34 @@ python3 torchchat.py generate llama3 --prompt "write me a story about a boy and | |||
|
|||
For more information run `python3 torchchat.py generate --help` | |||
|
|||
The `Generator` class can also be imported into a Python program to generate responses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything else in this readme is about the commandline. Torchchat isn't meant to be a library that other people can import and use, so we shouldn't talk about the internals of files in this README.
utils/utils.py
Outdated
from subprocess import check_output | ||
|
||
import torch | ||
def get_device_info(name: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a pydoc string. What sort of names does it expect as input? What kinds of strings can it return?
generate.py
Outdated
profile: Optional[Path], | ||
quantize, | ||
draft_quantize, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a doc comment describing all of the arguments.
utils/utils.py
Outdated
@@ -0,0 +1,24 @@ | |||
import platform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file needs a copyright header
ba0a1a3
to
5518efc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look at my other pending comments, too
eval.py
Outdated
return encoded | ||
|
||
def tok_decode(self, tokens): | ||
decoded = self._tokenizer.decode(tokens) | ||
return decoded | ||
|
||
def _model_call(self, inps): | ||
# TODO: make batches work | ||
# TODO: make bat˚≈˚ches work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stray extra characters
eval.py
Outdated
@@ -85,11 +84,17 @@ def __init__( | |||
self, | |||
model: Transformer, | |||
tokenizer, | |||
model_forward: callable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
callable
isn't a type, it's a built-in function that returns true if something is a callable. These should be annotated using from typing import Callable
, model_forward: Callable,
instead. Same for other uses of lowercase callable
.
generate.py
Outdated
""" | ||
Generates text samples based on a pre-trained Transformer model and tokenizer. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc strings need to be inside the thing that they document
class Generator:
"""Generates text samples based on a pre-trained Transformer model and tokenizer."""
README.md
Outdated
@@ -118,6 +118,34 @@ python3 torchchat.py generate llama3 --prompt "write me a story about a boy and | |||
|
|||
For more information run `python3 torchchat.py generate --help` | |||
|
|||
The `Generator` class can also be imported into a Python program to generate responses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this section from the readme
api/api.py
Outdated
|
||
|
||
@dataclass | ||
class AbstractMessageType(ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add docstrings on these classes. Especially this one, which represents some kind of base class.
For each class, if there's a URL that describes the type and the fields in it, please point to that URL. As-is, it's not clear what most of the fields in this file mean, what their valid values are, etc.
api/api.py
Outdated
|
||
|
||
@dataclass | ||
class AbstractMessageType(ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do users need to see this type? Or could it be _AbstractMessageType
to hide it?
api/api.py
Outdated
|
||
|
||
@dataclass | ||
class ToolMessage(AbstractMessageType): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is ToolMessage
not ToolMessageType
, which is inconsistent with the other subclasses of AbstractMessageType.
tbh I prefer the version without Type
, since these are all types; could the other subclasses drop the Type
too, or is this part of the API? Either way, it'd be better to make the names consistent if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The generate.py commit is looking really good! Thanks for all of the changes so far.
The final blocking issue is the "sys.path.append" thing.
eval.py
Outdated
@@ -85,11 +84,17 @@ def __init__( | |||
self, | |||
model: Transformer, | |||
tokenizer, | |||
model_forward: Callable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the code checks to see if this is None, this should be
model_forward: Optional[Callable] = None,
Could leave out the = None
part, but it makes things easier for callers if they don't want to provide model_forward. And it preserves BC for this API.
generate.py
Outdated
builder_args (BuilderArgs): A BuilderArgs object defining the model configuration | ||
speculative_builder_args (BuilderArgs): A BuilderArgs object defining the speculative model configuration for speculative decode | ||
tokenizer_args (TokenizerArgs): A TokenizerArgs object defining the tokenizer configuration for both the model and speculative model | ||
generator_args (GeneratorArgs): A GeneratorArgs object controlling the generation parameters | ||
profile (Path): A Path object pointing to a directory where the profiling results will be stored, if enabled. | ||
quantize (bool): Whether to quantize the model. | ||
draft_quantize (bool): Whether to quantize the draft model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc comments shouldn't mention the types; the params themselves are (or should be) type-annotated instead, and the reader can look at those. This reduces the possibility of the doc getting out of sync with the code.
generate.py
Outdated
quantize, | ||
draft_quantize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the docstring, these should both be annotated as bool
generate.py
Outdated
# support running without installing as a package | ||
wd = Path(__file__).parent.parent.resolve() | ||
sys.path.append(str(wd)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder about this. This should not happen inside the class.
generate.py
Outdated
This model is not known to support the chat function | ||
and may produce nonsensical or false output. | ||
******************************************************* | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inconsistent indentation
print(textwrap.dedent(
"""
*******************************************************
This model is not known to support the chat function
and may produce nonsensical or false output.
*******************************************************
"""
))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments and docstrings on api.py, they really help a lot. Please take a look at the other unresolved comments I left on this file earlier.
api/api.py
Outdated
""" | ||
Message classes and associated objects - see the types of Messages under "Create Chat Completion >>> Request body >>> messages" | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this doesn't annotate anything specific, it should be a #
comment block. The first docstring on line 11 will be associated with the file/module, but this second string will be ignored by pydoc.
api/api.py
Outdated
|
||
@dataclass | ||
class AbstractMessage(ABC): | ||
"Base class with common parameters for Message types. All Messages require a role and can have a content string" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always use """
for docstrings.
Also, docstrings should always start with a short (<80col) summary on the first line with the quotes, ending with a period. Any additional details should be after a blank line, and use full sentences ending with periods.
See https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings for full details.
"""Base class with common parameters for Message types.
All Messages require a role and can have a content string.
"""
api/api.py
Outdated
|
||
@dataclass | ||
class AbstractMessage(ABC): | ||
"Base class with common parameters for Message types. All Messages require a role and can have a content string" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are valid/expected role and content strings? How would an author choose what to put in those fields?
api/api.py
Outdated
tool_calls: Optional[List[ToolCall]] = None | ||
|
||
|
||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also be a #
comment. (but thank you for adding these block comments)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The browser commit looks good.
c830c61
to
ea88641
Compare
utils/device_info.py
Outdated
import torch | ||
|
||
|
||
def get_device_info(name: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a docstring. What format of string does this expect? What format of string does it return?
18f43bf
to
40c4676
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple minor things, but looks great!
Next step is to split this into three PRs using ghstack
generate.py
Outdated
builder_args: A BuilderArgs object defining the model configuration | ||
speculative_builder_args: A BuilderArgs object defining the speculative model configuration for speculative decode | ||
tokenizer_args: A TokenizerArgs object defining the tokenizer configuration for both the model and speculative model | ||
generator_args: A GeneratorArgs object controlling the generation parameters | ||
profile: A Path object pointing to a directory where the profiling results will be stored, if enabled. | ||
quantize: Whether to quantize the model. | ||
draft_quantize: Whether to quantize the draft model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ok, but in the future don't mention the types of things ("A XYZ object") since it's already provided by the type annotation on the function definition.
generate.py
Outdated
tokenizer_args: A TokenizerArgs object defining the tokenizer configuration for both the model and speculative model | ||
generator_args: A GeneratorArgs object controlling the generation parameters | ||
profile: A Path object pointing to a directory where the profiling results will be stored, if enabled. | ||
quantize: Whether to quantize the model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ok, but when documenting booleans it's better to say like "If true, quantizes the model." The "whether" phrasing can be ambiguous, especially when the arg names are less clear than these.
utils/device_info.py
Outdated
"""Returns a human-readable description of the hardware based on a torch.device.type | ||
|
||
Args: | ||
device: A torch.device.type string, such as CPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPU
isn't a valid input; it only accepts "cpu". I'd say something like
device: A torch.device.type string: one of {"cpu", "cuda"}.
or just
device: A torch.device.type string.
api/api.py
Outdated
class _AbstractMessage(ABC): | ||
"""Base class with common parameters for Message types. | ||
|
||
All Messages require a role and can have a content string. Each Message type originates from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still doesn't describe the expected format of these strings, especially role
.
In general, saying that something is a string isn't enough for someone to know how to use it. If the format of the string matters, then the user needs to know what the valid or expected values are. If the string can be anything, then they should know that, but also know how the string is being used.
else self.model.config.max_seq_length | ||
) | ||
|
||
def completion(self, completion_request: CompletionRequest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method seems to be the primary way of interacting with this class, so it should have a docstring.
And the docstring should have a "Yields:" section (instead of a "Returns:" section) describing the return value.
else self.model.config.max_seq_length | ||
) | ||
|
||
def completion(self, completion_request: CompletionRequest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a return/yield type if possible. https://stackoverflow.com/a/38423388 talks about annotating generator functions.
c60904b
to
d49a451
Compare
Description
Encapsulate the functions in generate.py within a class, allowing us to create child classes that override functions where needed.
Test plan:
Run generate
Run eval:
Run generate --compiled