README.md: 14 additions & 2 deletions
@@ -123,7 +123,7 @@ We use several variables in this example, which may be set as a preparatory step
 name of the directory holding the files for the corresponding model. You *must* follow this convention to
 ensure correct operation.
 
-*`MODEL_OUT` is the location where we store model and tokenizer information for a particular model. We recommend `checkpoints/${MODEL_NAME}`
+*`MODEL_DIR` is the location where we store model and tokenizer information for a particular model. We recommend `checkpoints/${MODEL_NAME}`
 or any other directory you already use to store model information.
 
 *`MODEL_PATH` describes the location of the model. Throughput the description
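As a rough illustration of the convention this hunk documents, the sketch below builds the recommended `MODEL_DIR` from `MODEL_NAME` and checks that the directory's own name matches the model name. It is not part of the PR: the helper names and the model name `example-model` are hypothetical, and the README itself sets these values as shell variables.

```python
from pathlib import Path


def model_dir_for(model_name: str, root: str = "checkpoints") -> Path:
    """Return the recommended MODEL_DIR, i.e. checkpoints/${MODEL_NAME}."""
    return Path(root) / model_name


def follows_convention(model_dir: Path, model_name: str) -> bool:
    """True when the directory holding the model files is named after the model."""
    return model_dir.name == model_name


if __name__ == "__main__":
    model_name = "example-model"  # hypothetical model name
    model_dir = model_dir_for(model_name)
    print(model_dir.as_posix())
    print(follows_convention(model_dir, model_name))
```

Any other directory can be passed as `root`, which mirrors the README's allowance for a directory you already use to store model information.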
@@ -272,6 +272,18 @@ we cannot presently run runner/run.cpp with llama3, until we have a C/C++ tokeni
 
 # Optimizing your model for server, desktop and mobile devices
 
+## Model precision (dtype precision setting)_
+
+You can generate models (for both export and generate, with eager, torch.compile, AOTI, ET, for all backends - mobile at present will primarily support fp32, with all options)
+Unlike gpt-fast which uses bfloat16 as default, Torch@ uses float32 as the default. As a consequence you will have to set to `--dtype bf16` or `--dtype fp16` on server / desktop for best performance.
+
+
 ## Making your models fit and execute fast!
 
 Next, we'll show you how to optimize your model for mobile execution
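One reason the 16-bit settings named in this hunk tend to help on server and desktop is that they halve the storage per weight relative to float32. The sketch below estimates weight memory for each `--dtype` value mentioned in the README text. It is not part of the PR: the function name and the 7-billion parameter count are hypothetical, and the estimate ignores activations, caches, and any quantization.

```python
# Bytes per element for each --dtype value named in the README text.
BYTES_PER_ELEMENT = {"fp32": 4, "bf16": 2, "fp16": 2}


def weight_footprint_gib(num_params: int, dtype: str) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return num_params * BYTES_PER_ELEMENT[dtype] / 2**30


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical parameter count
    for dtype in ("fp32", "bf16", "fp16"):
        size = weight_footprint_gib(num_params, dtype)
        print(f"--dtype {dtype}: ~{size:.1f} GiB of weights")
```

Both 16-bit formats give the same footprint; they differ in how the 16 bits are split between exponent and mantissa, so the choice between them is about numeric range and precision, not size.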
@@ -526,7 +538,7 @@ Check out the [tutorial on how to build an Android app running your PyTorch mode
-Detailed step by step in conjunction with ET Android build, to run on simulator for Android.
+Detailed step by step in conjunction with ET Android build, to run on simulator for Android.`scripts/android_example.sh` for running a model on an Android simulator (on Mac)