```diff
@@ -16,14 +16,14 @@ The 'llama runner' is a native standalone application capable of
 running a model exported and compiled ahead-of-time with either
 Executorch (ET) or AOT Inductor (AOTI). Which model format to use
 depends on your requirements and preferences. Executorch models are
-optimized for portability across a range of decices, including mobile
+optimized for portability across a range of devices, including mobile
 and edge devices. AOT Inductor models are optimized for a particular
 target architecture, which may result in better performance and
 efficiency.

 Building the runners is straightforward with the included cmake build
 files and is covered in the next sections. We will showcase the
-runners using ~~stories15M~~ llama2 7B and llama3.
+runners using llama2 7B and llama3.

 ## What can you do with torchchat's llama runner for native execution?

```
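The cmake build referenced above typically reduces to a configure-and-build pair. A minimal sketch follows; the `cmake-out` directory and the `aoti_run`/`et_run` target names are assumptions based on torchchat's build setup, not confirmed by this diff:

```bash
# Minimal sketch of the cmake build covered in the next sections.
# The cmake-out directory and the aoti_run/et_run target names are assumptions.
cmake -S . -B ./cmake-out -G Ninja            # configure
cmake --build ./cmake-out --target aoti_run   # AOT Inductor runner
cmake --build ./cmake-out --target et_run     # Executorch runner
```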
```diff
@@ -160,7 +160,7 @@ and native execution environments, respectively.

 After exporting a model, you will want to verify that the model
 delivers output of high quality, and works as expected. Both can be
-achieved with the Python environment. All torchchat Python comands
+achieved with the Python environment. All torchchat Python commands
 can work with exported models. Instead of loading the model from a
 checkpoint or GGUF file, use the `--dso-path model.so` and
 `--pte-path model.pte` for loading both types of exported models. This
```
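As the text notes, any torchchat Python command can point at an exported model through these flags. A minimal sketch; the `generate` subcommand, the `llama3` model name, and the `--prompt` flag are assumptions here, while only `--dso-path`/`--pte-path` come from the diff itself:

```bash
# Minimal sketch: verify an exported model with a torchchat Python command.
# The generate subcommand, model name, and --prompt flag are assumptions;
# only the --dso-path/--pte-path flags are taken from the text above.
python3 torchchat.py generate llama3 --dso-path model.so --prompt "Hello, my name is"   # AOT Inductor export
python3 torchchat.py generate llama3 --pte-path model.pte --prompt "Hello, my name is"  # Executorch export
```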