README.md: 20 additions & 11 deletions
@@ -293,13 +293,18 @@ Use the "Max Response Tokens" slider to limit the maximum number of tokens generated
 ## Desktop/Server Execution
 
 ### AOTI (AOT Inductor)
-[AOTI](https://pytorch.org/blog/pytorch2-2/) compiles models before execution for faster inference. The process creates a [DSO](https://en.wikipedia.org/wiki/Shared_library) model (represented by a file with extension `.so`)
-that is then loaded for inference. This can be done with both Python and C++ environments.
+[AOTI](https://pytorch.org/blog/pytorch2-2/) compiles models before execution
+for faster inference. The process creates a zipped PT2 file containing all the
+artifacts generated by AOTInductor, and a
+[.so](https://en.wikipedia.org/wiki/Shared_library) file with the runnable
+contents that is then loaded for inference. This can be done in both Python
+and C++ environments.
 
 The following example exports and executes the Llama3.1 8B Instruct
 model. The first command compiles and performs the actual export.
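The export command itself falls outside this hunk. A minimal sketch of that step, assuming the `export` subcommand accepts an `--output-aoti-package-path` flag mirroring the `--aoti-package-path` flag used below, and relying on the diff's statement that the PT2 file is a zip archive:

```bash
# Sketch, not the verbatim README command: compile and export Llama3.1 to a PT2 package
# (--output-aoti-package-path is an assumed flag name)
python3 torchchat.py export llama3.1 --output-aoti-package-path exportedModels/llama3_1_artifacts.pt2

# The PT2 package is a zip archive, so its bundled artifacts can be listed directly
unzip -l exportedModels/llama3_1_artifacts.pt2
```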
@@ -311,12 +316,11 @@ case visit our [customization guide](docs/model_customization.md).
 
 ### Run in a Python Environment
 
-To run in a python environment, use the generate subcommand like before, but include the dso file.
+To run in a Python environment, use the generate subcommand like before, but include the pt2 file.
 
+```bash
+python3 torchchat.py generate llama3.1 --aoti-package-path exportedModels/llama3_1_artifacts.pt2 --prompt "Hello my name is"
 ```
-python3 torchchat.py generate llama3.1 --dso-path exportedModels/llama3.1.so --prompt "Hello my name is"
-```
-**Note:** Depending on which accelerator is used to generate the .dso file, the command may need the device specified: `--device (cuda | cpu)`.
 
 
 ### Run using our C++ Runner
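The deleted note above documented a `--device (cuda | cpu)` flag for `.dso` files. If a device mismatch comes up with PT2 artifacts, the same flag may still apply; a hedged sketch, assuming `--device` carries over unchanged:

```bash
# Sketch: pin generation to the accelerator the PT2 was exported for
# (--device usage is assumed from the removed .dso note)
python3 torchchat.py generate llama3.1 --aoti-package-path exportedModels/llama3_1_artifacts.pt2 --device cuda --prompt "Hello my name is"
```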
@@ -326,11 +330,10 @@ To run in a C++ environment, we need to build the runner binary.
 torchchat/utils/scripts/build_native.sh aoti
 ```
 
-Then run the compiled executable, with the exported DSO from earlier.
+Then run the compiled executable with the exported PT2 from earlier.
 ```bash
-cmake-out/aoti_run exportedModels/llama3.1.so -z `python3 torchchat.py where llama3.1`/tokenizer.model -l 3 -i "Once upon a time"
+cmake-out/aoti_run exportedModels/llama3_1_artifacts.pt2 -z `python3 torchchat.py where llama3.1`/tokenizer.model -l 3 -i "Once upon a time"
 ```
-**Note:** Depending on which accelerator is used to generate the .dso file, the runner may need the device specified: `-d (CUDA | CPU)`.
 
 ## Mobile Execution
 
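Similarly, the removed runner note documented a `-d (CUDA | CPU)` device flag. A sketch assuming `-d` still applies to PT2 packages (the `-z`, `-l`, and `-i` flags are taken verbatim from the command above):

```bash
# Sketch: run the C++ runner pinned to CUDA
# (-d usage is assumed from the removed .dso note)
cmake-out/aoti_run exportedModels/llama3_1_artifacts.pt2 -d CUDA -z `python3 torchchat.py where llama3.1`/tokenizer.model -l 3 -i "Once upon a time"
```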
@@ -570,6 +573,12 @@ We provide
 
 We really value our community and the contributions made by our wonderful users. We'll use this section to call out some of these contributions! If you'd like to help out as well, please see the [CONTRIBUTING](CONTRIBUTING.md) guide.
 
+To connect with us and other community members, we invite you to join our Slack community by filling out this [form](https://docs.google.com/forms/d/e/1FAIpQLSeADnUNW36fjKjYzyHDOzEB_abKQE9b6gqqW9NXse6O0MWh0A/viewform). Once you've joined, you can:
+* Head to the `#torchchat-general` channel for general questions, discussion, and community support.
+* Join the `#torchchat-contribution` channel if you're interested in contributing directly to project development.
+
+We look forward to discussing the future of torchchat with you!
+
 ## Troubleshooting
 
 A section covering commonly encountered setup errors and exceptions. If this section doesn't cover your situation, check the GitHub [issues](https://github.com/pytorch/torchchat/issues)