README.md: 6 additions & 2 deletions
@@ -200,6 +200,10 @@ streamlit run torchchat.py -- browser llama3.1
 <details>
 <summary>This mode gives a REST API that matches the OpenAI API spec for interacting with a model</summary>

+The server follows the [OpenAI API specification](https://platform.openai.com/docs/api-reference/chat) for chat completions.
+Since this feature is under active development, not every parameter is consumed. See api/api.py for details on
+which request parameters are implemented. If you encounter any issues, please comment on the [tracking Github issue](https://github.com/pytorch/torchchat/issues/973).
+
 To test out the REST API, **you'll need 2 terminals**: one to host the server, and one to send the request.

 In one terminal, start the server
@@ -213,8 +217,7 @@ python3 torchchat.py server llama3.1

 In another terminal, query the server using `curl`. Depending on the model configuration, this query might take a few minutes to respond.

-Setting `stream` to "true" in the request emits a response in chunks. Currently, this response
-is plaintext and will not be formatted to the OpenAI API specification. If `stream` is unset or not "true", then the client will await the full response from the server.
+Setting `stream` to "true" in the request emits a response in chunks. If `stream` is unset or not "true", then the client will await the full response from the server.
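
For reference, a minimal sketch of the two-terminal workflow the updated README describes. The server command is taken from the hunk header above; the endpoint path follows the OpenAI chat completions spec, but the host/port (`127.0.0.1:5000`) and the exact request fields shown here are assumptions — check the server's startup output and api/api.py for what is actually supported.

```bash
# Terminal 1: start the OpenAI-compatible server (command from the diff above).
python3 torchchat.py server llama3.1

# Terminal 2: send a chat completion request.
# The host/port below is an assumption -- use whatever address the server prints on startup.
# Setting "stream" to "true" returns the response in chunks; omitting it waits for the full response.
curl http://127.0.0.1:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1",
    "stream": "true",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'
```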