Commit 2970674

mikekgfb authored and malfet committed
T (#341)
1 parent 6169784 commit 2970674

4 files changed: +14 -14 lines changed


docs/Android.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 Check out the [tutorial on how to build an Android app running your
 PyTorch models with
-Executorch](https://pytorch.org/executorch/main/llm/llama-demo-android.html),
+ExecuTorch](https://pytorch.org/executorch/main/llm/llama-demo-android.html),
 and give your torchat models a spin.
 
 ![Screenshot](https://pytorch.org/executorch/main/_static/img/android_llama_app.png "Android app running Llama model")

docs/GGUF.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ export GGUF_PTE_PATH=/tmp/gguf_model.pte
 python torchchat.py generate --gguf-path ${GGUF_MODEL_PATH} --tokenizer-path ${GGUF_TOKENIZER_PATH} --temperature 0 --prompt "In a faraway land" --max-new-tokens 20
 ```
 
-### Executorch export + generate
+### ExecuTorch export + generate
 ```
 # Convert the model for use
 python torchchat.py export --gguf-path ${GGUF_MODEL_PATH} --output-pte-path ${GGUF_PTE_PATH}
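
For orientation, the docs/GGUF.md section touched here describes an end-to-end GGUF workflow. A minimal sketch of that flow, assembled from the commands quoted in this hunk (the example values for `GGUF_MODEL_PATH` and `GGUF_TOKENIZER_PATH` are assumptions for illustration; `GGUF_PTE_PATH` matches the hunk header):

```
# Paths are placeholders; point them at your own GGUF checkpoint and tokenizer.
export GGUF_MODEL_PATH=/tmp/model.gguf
export GGUF_TOKENIZER_PATH=/tmp/tokenizer.model
export GGUF_PTE_PATH=/tmp/gguf_model.pte

# Eager generation directly from the GGUF checkpoint.
python torchchat.py generate --gguf-path ${GGUF_MODEL_PATH} --tokenizer-path ${GGUF_TOKENIZER_PATH} --temperature 0 --prompt "In a faraway land" --max-new-tokens 20

# ExecuTorch export: convert the model into a .pte artifact for mobile/edge execution.
python torchchat.py export --gguf-path ${GGUF_MODEL_PATH} --output-pte-path ${GGUF_PTE_PATH}
```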

docs/MISC.md

Lines changed: 9 additions & 9 deletions
@@ -173,7 +173,7 @@ We use several variables in this example, which may be set as a preparatory step
 by replacing the modelname with the name of the tokenizer model which is expected to be named `tokenizer.model`.
 
 * `MODEL_OUT` is a location for outputs from export for server/desktop and/or mobile/edge execution. We store exported
-artifacts here, with extensions .pte for Executorch models, .so for AOT Inductor generated models, and .bin for tokenizers
+artifacts here, with extensions .pte for ExecuTorch models, .so for AOT Inductor generated models, and .bin for tokenizers
 prepared for use with the C++ tokenizers user by `runner-aoti` and `runner-et`.
 
 You can set these variables as follows for the exemplary model15M model from Andrej Karpathy's tinyllamas model family:
@@ -184,9 +184,9 @@ MODEL_PATH=${MODEL_OUT}/stories15M.pt
 MODEL_OUT=~/torchchat-exports
 ```
 
-When we export models with AOT Inductor for servers and desktops, and Executorch for mobile and edge devices,
+When we export models with AOT Inductor for servers and desktops, and ExecuTorch for mobile and edge devices,
 we will save them in the specified directory (`${MODEL_OUT}` in our example below) as a DSO under the name `${MODEL_NAME}.so` (for AOTI-generated dynamic libraries),
-or as Executorch model under the name `${MODEL_NAME}.pte` (for Executorch-generated mobile/edge models).
+or as ExecuTorch model under the name `${MODEL_NAME}.pte` (for Executorch-generated mobile/edge models).
 
 We use `[ optional input ]` to indicate optional inputs, and `[ choice 1 | choice 2 | ... ]` to indicate a choice
 
@@ -271,7 +271,7 @@ quantization to achieve this, as described below.
 We export the model with the export.py script. Running this script requires you first install executorch with pybindings, see [here](#setting-up-executorch-and-runner-et).
 At present, when exporting a model, the export command always uses the
 xnnpack delegate to export. (Future versions of torchchat will support additional
-delegates such as Vulkan, CoreML, MPS, HTP in addition to Xnnpack as they are released for Executorch.)
+delegates such as Vulkan, CoreML, MPS, HTP in addition to Xnnpack as they are released for ExecuTorch.)
 
 ### Running the model
 
@@ -284,7 +284,7 @@ python generate.py --checkpoint-path ${MODEL_PATH} --pte ${MODEL_OUT}/model.pte
 You can also run the model with the runner-et. See below under "Standalone Execution".
 
 While we have shown the export and execution of a small model to a mobile/edge
-device supported by Executorch, most models need to be compressed to
+device supported by ExecuTorch, most models need to be compressed to
 fit in the target device's memory. We use quantization to achieve this.
 
 
@@ -458,7 +458,7 @@ groupsize set to 0 which uses channelwise quantization:
 python generate.py [--compile] --checkpoint-path ${MODEL_PATH} --prompt "Hello, my name is" --quant '{"linear:int8" : {"bitwidth": 8, "groupsize": 0}}' --device cpu
 ```
 
-Then, export as follows using Executorch for mobile backends:
+Then, export as follows using ExecuTorch for mobile backends:
 ```
 python export.py --checkpoint-path ${MODEL_PATH} -d fp32 --quant '{"linear:int8": {"bitwidth": 8, "groupsize": 0} }' --output-pte-path ${MODEL_OUT}/${MODEL_NAME}_int8.pte
 ```
@@ -486,7 +486,7 @@ We can do this in eager mode (optionally with `torch.compile`), we use the `line
 python generate.py [--compile] --checkpoint-path ${MODEL_PATH} --prompt "Hello, my name is" --quant '{"linear:int8" : {"bitwidth": 8, "groupsize": 8}}' --device cpu
 ```
 
-Then, export as follows using Executorch:
+Then, export as follows using ExecuTorch:
 ```
 python export.py --checkpoint-path ${MODEL_PATH} -d fp32 --quant '{"linear:int8": {"bitwidth": 8, "groupsize": 0} }' --output-pte-path ${MODEL_OUT}/${MODEL_NAME}_int8-gw256.pte
 ```
@@ -607,7 +607,7 @@ After this is done, you can run runner-et with
 ```
 
 While we have shown the export and execution of a small model to a mobile/edge
-device supported by Executorch, most models need to be compressed to
+device supported by ExecuTorch, most models need to be compressed to
 fit in the target device's memory. We use quantization to achieve this.
 
 
@@ -630,7 +630,7 @@ To run your pte model, use the following command (assuming you already generated
 
 ### Android
 
-Check out the [tutorial on how to build an Android app running your PyTorch models with Executorch](https://pytorch.org/executorch/main/llm/llama-demo-android.html), and give your torchchat models a spin.
+Check out the [tutorial on how to build an Android app running your PyTorch models with ExecuTorch](https://pytorch.org/executorch/main/llm/llama-demo-android.html), and give your torchchat models a spin.
 
 ![Screenshot](https://pytorch.org/executorch/main/_static/img/android_llama_app.png "Android app running Llama model")
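
The docs/MISC.md hunks above walk through quantize-then-export for mobile targets. A minimal sketch of that sequence, using only commands quoted in these hunks (`MODEL_PATH`, `MODEL_NAME`, and `MODEL_OUT` are set as in the hunk around line 184; the final run command is copied, possibly truncated, from the @@ -284 hunk header):

```
# 8-bit linear quantization in eager mode; groupsize 0 selects channelwise quantization.
# (--compile may optionally be added, per the doc's [ optional input ] convention.)
python generate.py --checkpoint-path ${MODEL_PATH} --prompt "Hello, my name is" --quant '{"linear:int8" : {"bitwidth": 8, "groupsize": 0}}' --device cpu

# Export the quantized model with ExecuTorch (xnnpack delegate) to a .pte artifact.
python export.py --checkpoint-path ${MODEL_PATH} -d fp32 --quant '{"linear:int8": {"bitwidth": 8, "groupsize": 0} }' --output-pte-path ${MODEL_OUT}/${MODEL_NAME}_int8.pte

# Run the exported .pte (command form as it appears, possibly truncated, in the @@ -284 hunk header).
python generate.py --checkpoint-path ${MODEL_PATH} --pte ${MODEL_OUT}/model.pte
```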

docs/quantization.md

Lines changed: 3 additions & 3 deletions
@@ -46,7 +46,7 @@ python generate.py [--compile] --checkpoint-path ${MODEL_PATH} --prompt "Hello,
 
 ```
 
-Then, export as follows with Executorch:
+Then, export as follows with ExecuTorch:
 ```
 python export.py --checkpoint-path ${MODEL_PATH} -d fp32 --quant '{"embedding": {"bitwidth": 8, "groupsize": 0} }' --output-pte-path ${MODEL_OUT}/${MODEL_NAME}_emb8b-gw256.pte
 ```
@@ -127,7 +127,7 @@ We can do this in eager mode (optionally with torch.compile), we use the linear:
 python generate.py [--compile] --checkpoint-path ${MODEL_PATH} --prompt "Hello, my name is" --quant '{"linear:int8" : {"bitwidth": 8, "groupsize": 0}}' --device cpu
 ```
 
-Then, export as follows using Executorch for mobile backends:
+Then, export as follows using ExecuTorch for mobile backends:
 
 ```
 python export.py --checkpoint-path ${MODEL_PATH} -d fp32 --quant '{"linear:int8": {"bitwidth": 8, "groupsize": 0} }' --output-pte-path ${MODEL_OUT}/${MODEL_NAME}_int8.pte
@@ -157,7 +157,7 @@ We can do this in eager mode (optionally with torch.compile), we use the linear:
 ```
 python generate.py [--compile] --checkpoint-path ${MODEL_PATH} --prompt "Hello, my name is" --quant '{"linear:int8" : {"bitwidth": 8, "groupsize": 8}}' --device cpu
 ```
-Then, export as follows using Executorch:
+Then, export as follows using ExecuTorch:
 
 ```
 python export.py --checkpoint-path ${MODEL_PATH} -d fp32 --quant '{"linear:int8": {"bitwidth": 8, "groupsize": 0} }' --output-pte-path ${MODEL_OUT}/${MODEL_NAME}_int8-gw256.pte
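
Likewise, the docs/quantization.md hunks cover embedding-table and linear quantization before ExecuTorch export. A short sketch for the embedding case: the export command is quoted from the first hunk, while the eager-mode `--quant` argument is an assumption made by analogy with the `linear:int8` examples above:

```
# Eager-mode check with 8-bit embedding quantization (quant JSON assumed by analogy; not quoted from the diff).
python generate.py --checkpoint-path ${MODEL_PATH} --prompt "Hello, my name is" --quant '{"embedding": {"bitwidth": 8, "groupsize": 0}}' --device cpu

# ExecuTorch export with the same embedding quantization (as quoted in the @@ -46 hunk).
python export.py --checkpoint-path ${MODEL_PATH} -d fp32 --quant '{"embedding": {"bitwidth": 8, "groupsize": 0} }' --output-pte-path ${MODEL_OUT}/${MODEL_NAME}_emb8b-gw256.pte
```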
