To run in a Python environment, use the generate subcommand as before, but include the DSO file.
```bash
python3 torchchat.py generate llama3.1 --dso-path exportedModels/llama3.1.so --prompt "Hello my name is"
```
**Note:** Depending on which accelerator is used to generate the .dso file, the command may need the device specified: `--device (cuda | cpu)`.
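As the note says, the `--device` flag must match the accelerator the `.so` was exported on. A small shell sketch (not part of torchchat; it assumes `nvidia-smi` is present only on machines with a CUDA-capable GPU) that picks the flag and prints the resulting command:

```shell
# Pick the device flag that matches the local hardware.
if command -v nvidia-smi >/dev/null 2>&1; then
  DEVICE=cuda
else
  DEVICE=cpu
fi
# Print the generate command with the chosen flag.
echo python3 torchchat.py generate llama3.1 \
  --dso-path exportedModels/llama3.1.so \
  --device "$DEVICE" --prompt "'Hello my name is'"
```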
To run in a C++ environment, we need to build the runner binary.

```bash
torchchat/utils/scripts/build_native.sh aoti
```
To compile the AOTI-generated artifacts into a `.so`:
```bash
make -C exportedModels/llama3_1_artifacts
```
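Once `make` finishes, the shared object should exist at the path the runner expects. A quick existence check (purely illustrative, not a torchchat step):

```shell
# Verify the compiled shared object exists before invoking the runner.
SO=exportedModels/llama3_1_artifacts/llama3_1_artifacts.so
if [ -f "$SO" ]; then
  echo "found $SO"
else
  echo "build output not found: $SO"
fi
```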
Then run the compiled executable with the compiled DSO.
```bash
cmake-out/aoti_run exportedModels/llama3_1_artifacts/llama3_1_artifacts.so -z `python3 torchchat.py where llama3.1`/tokenizer.model -l 3 -i "Once upon a time"
```
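The `-z` argument above relies on shell command substitution: the output of `python3 torchchat.py where llama3.1` (the model's download directory) is spliced into the tokenizer path. A minimal illustration of the same pattern, with a placeholder directory standing in for the `where` command:

```shell
# Command substitution: the inner command's stdout becomes part of the path.
MODEL_DIR=$(printf '%s' '/tmp/models/llama3.1')  # placeholder for: python3 torchchat.py where llama3.1
TOKENIZER="$MODEL_DIR/tokenizer.model"
echo "$TOKENIZER"
```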
**Note:** Depending on which accelerator is used to generate the .dso file, the runner may need the device specified: `-d (CUDA | CPU)`.