
Commit 17a9c86

mikekgfb authored and malfet committed
Update README.md (#116)
Update README.md Update README.md (#118) Update README.md Update README.md (#121) Update README based on #107
1 parent 682dabb commit 17a9c86

File tree: 1 file changed, +14 -2 lines


README.md

Lines changed: 14 additions & 2 deletions
@@ -123,7 +123,7 @@ We use several variables in this example, which may be set as a preparatory step
 name of the directory holding the files for the corresponding model. You *must* follow this convention to
 ensure correct operation.
 
-* `MODEL_OUT` is the location where we store model and tokenizer information for a particular model. We recommend `checkpoints/${MODEL_NAME}`
+* `MODEL_DIR` is the location where we store model and tokenizer information for a particular model. We recommend `checkpoints/${MODEL_NAME}`
 or any other directory you already use to store model information.
 
 * `MODEL_PATH` describes the location of the model. Throughout the description
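As a quick illustration of the convention the renamed `MODEL_DIR` bullet describes, a minimal sketch; the model name here is purely illustrative and not taken from this diff:

```
# Sketch only: any model name that matches the repo's directory convention works
MODEL_NAME=my-model
MODEL_DIR=checkpoints/${MODEL_NAME}
```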
@@ -272,6 +272,18 @@ we cannot presently run runner/run.cpp with llama3, until we have a C/C++ tokenizer
 
 # Optimizing your model for server, desktop and mobile devices
 
+## Model precision (dtype precision setting)
+
+You can specify the precision of the model (for both export and generate, with eager, torch.compile, AOTI, and ET, for all backends; mobile at present will primarily support fp32)
+with:
+```
+python generate.py --dtype [bf16 | fp16 | fp32] ...
+python export.py --dtype [bf16 | fp16 | fp32] ...
+```
+
+Unlike gpt-fast, which uses bfloat16 as its default, torchchat uses float32 as the default. As a consequence you will have to set `--dtype bf16` or `--dtype fp16` on server / desktop for best performance.
+
+
 ## Making your models fit and execute fast!
 
 Next, we'll show you how to optimize your model for mobile execution
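For example, a hypothetical server/desktop invocation combining `--dtype` with the `MODEL_PATH` variable described earlier in the README; only `--dtype` is documented in this diff, and the `--checkpoint-path` and `--prompt` flags are assumptions for illustration:

```
# Hypothetical: only --dtype is confirmed above; the other flags are illustrative
python generate.py --dtype bf16 --checkpoint-path ${MODEL_PATH} --prompt "Hello, my name is"
```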
@@ -526,7 +538,7 @@ Check out the [tutorial on how to build an Android app running your PyTorch models
 
 ![Screenshot](https://pytorch.org/executorch/main/_static/img/android_llama_app.png "Android app running Llama model")
 
-Detailed step by step in conjunction with ET Android build, to run on simulator for Android.
+Detailed step by step in conjunction with ET Android build, to run on simulator for Android. Use `scripts/android_example.sh` to run a model on an Android simulator (on Mac).
 
 
 ### iOS
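To try that simulator flow, a minimal sketch, assuming `scripts/android_example.sh` is run from the repository root and takes no required arguments (neither assumption is confirmed by this diff):

```
# Assumption: run from the repo root on a Mac; the script's exact interface is not shown in this diff
sh scripts/android_example.sh
```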
