
Commit 5eac329

Improvements for readability in ADVANCED-USERS.md (#1393)
* Various spelling corrections
* Remove empty performance tables
* Remove CONTRIBUTING section that is covered in the project root README
1 parent f821163 commit 5eac329


docs/ADVANCED-USERS.md

Lines changed: 18 additions & 72 deletions
@@ -18,10 +18,10 @@ Torchchat is currently in a pre-release state and under extensive development.
[shell default]: TORCHCHAT_ROOT=${PWD} ./torchchat/utils/scripts/install_et.sh


-This is the advanced users guide, if you're looking to get started
+This is the advanced users' guide, if you're looking to get started
with LLMs, please refer to the README at the root directory of the
torchchat distro. This is an advanced user guide, so we will have
-many more concepts and options to discuss and taking advantage of them
+many more concepts and options to discuss and take advantage of them
may take some effort.

We welcome community contributions of all kinds. If you find
@@ -41,7 +41,7 @@ While we strive to support a broad range of models, we can't test them
all. We classify supported models as tested ✅, work in progress 🚧 or
some restrictions ❹.

-We invite community contributions of new model suport and test results!
+We invite community contributions of new model support and test results!

| Model | Tested | Eager | torch.compile | AOT Inductor | ExecuTorch | Fits on Mobile |
|-----|--------|-------|-----|-----|-----|-----|
@@ -86,7 +86,7 @@ Server C++ runtime | n/a | run.cpp model.pte | ✅ |
Mobile C++ runtime | n/a | app model.pte | ✅ |
Mobile C++ runtime | n/a | app + AOTI | 🚧 |

-**Getting help:** Each command implements the --help option to give addititonal information about available options:
+**Getting help:** Each command implements the --help option to give additional information about available options:

[skip default]: begin
```
@@ -96,8 +96,8 @@ python3 torchchat.py [ export | generate | chat | eval | ... ] --help

Exported models can be loaded back into torchchat for chat or text
generation, letting you experiment with the exported model and valid
-model quality. The python interface is the same in all cases and is
-used for testing nad test harnesses too.
+model quality. The Python interface is the same in all cases and is
+used for testing and test harnesses, too.

Torchchat comes with server C++ runtimes to execute AOT Inductor and
ExecuTorch models. A mobile C++ runtimes allow you to deploy
@@ -115,7 +115,7 @@ Some common models are recognized by torchchat based on their filename
through `Model.from_name()` to perform a fuzzy match against a
table of known model architectures. Alternatively, you can specify the
index into that table with the option `--params-table ${INDEX}` where
-the index is the lookup key key in the [the list of known
+the index is the lookup key in the [the list of known
pconfigurations](https://github.com/pytorch/torchchat/tree/main/torchchat/model_params)
For example, for the stories15M model, this would be expressed as
`--params-table stories15M`. (We use the model constructor
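
The `--params-table` override discussed in this hunk composes with an ordinary `generate` call. A minimal sketch, reusing the `${MODEL_PATH}` variable this guide defines earlier:

```
python3 torchchat.py generate --checkpoint-path ${MODEL_PATH} --params-table stories15M --prompt "Once upon a time"
```
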
@@ -237,7 +237,7 @@ which chooses the best 16-bit floating point type.

The virtual device fast and virtual floating point data types fast and
fast16 are best used for eager/torch.compiled execution. For export,
-specify the your device choice for the target system with --device for
+specify your device choice for the target system with --device for
AOTI-exported DSO models, and using ExecuTorch delegate selection for
ExecuTorch-exported PTE models.

@@ -250,8 +250,7 @@ python3 torchchat.py generate [--compile] --checkpoint-path ${MODEL_PATH} --prom
To improve performance, you can compile the model with `--compile`
trading off the time to first token processed with time per token. To
improve performance further, you may also compile the prefill with
-`--compile_prefill`. This will increase further compilation times though. The
-`--compile-prefill` option is not compatible with `--prefill-prefill`.
+`--compile-prefill`. This will increase further compilation times though.

Parallel prefill is not yet supported by exported models, and may be
supported in a future release.
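
Combining the two flags from this hunk, with the spellings the corrected text settles on, an invocation would look roughly like:

```
python3 torchchat.py generate --compile --compile-prefill --checkpoint-path ${MODEL_PATH} --prompt "Once upon a time"
```
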
@@ -265,7 +264,7 @@ the introductory README.
In addition to running eval on models in eager mode and JIT-compiled
mode with `torch.compile()`, you can also load dso and pte models back
into the PyTorch to evaluate the accuracy of exported model objects
-(e.g., after applying quantization or other traqnsformations to
+(e.g., after applying quantization or other transformations to
improve speed or reduce model size).

Loading exported models back into a Python-based Pytorch allows you to
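
A sketch of evaluating an exported model, assuming `eval` accepts the same `--pte-path` option that `generate` uses later in this file (check `python3 torchchat.py eval --help` to confirm):

```
python3 torchchat.py eval --checkpoint-path ${MODEL_PATH} --pte-path ${MODEL_NAME}.pte
```
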
@@ -297,14 +296,14 @@ for ExecuTorch.)

We export the stories15M model with the following command for
execution with the ExecuTorch runtime (and enabling execution on a
-wide range of community and vendor supported backends):
+wide range of community and vendor-supported backends):

```
python3 torchchat.py export --checkpoint-path ${MODEL_PATH} --output-pte-path ${MODEL_NAME}.pte
```

Alternatively, we may generate a native instruction stream binary
-using AOT Inductor for CPU oor GPUs (the latter using Triton for
+using AOT Inductor for CPU or GPUs (the latter using Triton for
optimizations such as operator fusion):

```
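
The AOT Inductor export command itself falls outside this hunk. For orientation only, it plausibly mirrors the PTE export above with a DSO output flag (hypothetical here; consult the full ADVANCED-USERS.md for the exact flag):

```
python3 torchchat.py export --checkpoint-path ${MODEL_PATH} --output-dso-path ${MODEL_NAME}.so
```
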
@@ -319,10 +318,10 @@ the exported model artifact back into a model container with a
compatible API surface for the `model.forward()` function. This
enables users to test, evaluate and exercise the exported model
artifact with familiar interfaces, and in conjunction with
-pre-exiisting Python model unit tests and common environments such as
+pre-existing Python model unit tests and common environments such as
Jupyter notebooks and/or Google colab.

-Here is how to load an exported model into the python environment on the example of using an exported model with `generate.oy`.
+Here is how to load an exported model into the Python environment using an exported model with the `generate` command.

```
python3 torchchat.py generate --checkpoint-path ${MODEL_PATH} --pte-path ${MODEL_NAME}.pte --device cpu --prompt "Once upon a time"
@@ -452,7 +451,7 @@ strategies:
You can find instructions for quantizing models in
[docs/quantization.md](file:///./quantization.md). Advantageously,
quantization is available in eager mode as well as during export,
-enabling you to do an early exploration of your quantization setttings
+enabling you to do an early exploration of your quantization settings
in eager mode. However, final accuracy should always be confirmed on
the actual execution target, since all targets have different build
processes, compilers, and kernel implementations with potentially
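
For the eager-mode exploration this hunk mentions, torchchat's `--quantize` option takes a JSON scheme description; the scheme name and groupsize below are illustrative, with the authoritative list in docs/quantization.md:

```
python3 torchchat.py generate --checkpoint-path ${MODEL_PATH} --quantize '{"linear:int8": {"groupsize": 256}}' --prompt "Once upon a time"
```
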
@@ -464,9 +463,8 @@ significant impact on accuracy.

## Native (Stand-Alone) Execution of Exported Models

-Refer to the [README](README.md] for an introduction toNative
-execution on servers, desktops and laptops is described under
-[runner-build.md]. Mobile and Edge executipon for Android and iOS are
+Refer to the [README](README.md] for an introduction to native
+execution on servers, desktops, and laptops. Mobile and Edge execution for Android and iOS are
described under [torchchat/edge/docs/Android.md] and [torchchat/edge/docs/iOS.md], respectively.


@@ -475,7 +473,7 @@ described under [torchchat/edge/docs/Android.md] and [torchchat/edge/docs/iOS.md

PyTorch and ExecuTorch support a broad range of devices for running
PyTorch with python (using either eager or eager + `torch.compile`) or
-in a python-free environment with AOT Inductor and ExecuTorch.
+in a Python-free environment with AOT Inductor and ExecuTorch.


| Hardware | OS | Eager | Eager + Compile | AOT Compile | ET Runtime |
@@ -499,58 +497,6 @@ in a python-free environment with AOT Inductor and ExecuTorch.
*Key*: n/t -- not tested


-## Runtime performance with Llama 7B, in tokens per second (4b quantization)
-
-| Hardware | OS | eager | eager + compile | AOT compile | ET Runtime |
-|-----|------|-----|-----|-----|-----|
-| x86 | Linux | ? | ? | ? | ? |
-| x86 | macOS | ? | ? | ? | ? |
-| aarch64 | Linux | ? | ? | ? | ? |
-| aarch64 | macOS | ? | ? | ? | ? |
-| AMD GPU | Linux | ? | ? | ? | ? |
-| Nvidia GPU | Linux | ? | ? | ? | ? |
-| MPS | macOS | ? | ? | ? | ? |
-| MPS | iOS | ? | ? | ? | ? |
-| aarch64 | Android | ? | ? | ? | ? |
-| Mobile GPU (Vulkan) | Android | ? | ? | ? | ? |
-| CoreML | iOS | | ? | ? | ? | ? |
-| Hexagon DSP | Android | | ? | ? | ? | ? |
-| Raspberry Pi 4/5 | Raspbian | ? | ? | ? | ? |
-| Raspberry Pi 4/5 | Android | ? | ? | ? | ? |
-| ARM 32b (up to v7) | any | | ? | ? | ? | ? |
-
-
-## Runtime performance with Llama3, in tokens per second (4b quantization)
-
-| Hardware | OS | eager | eager + compile | AOT compile | ET Runtime |
-|-----|------|-----|-----|-----|-----|
-| x86 | Linux | ? | ? | ? | ? |
-| x86 | macOS | ? | ? | ? | ? |
-| aarch64 | Linux | ? | ? | ? | ? |
-| aarch64 | macOS | ? | ? | ? | ? |
-| AMD GPU | Linux | ? | ? | ? | ? |
-| Nvidia GPU | Linux | ? | ? | ? | ? |
-| MPS | macOS | ? | ? | ? | ? |
-| MPS | iOS | ? | ? | ? | ? |
-| aarch64 | Android | ? | ? | ? | ? |
-| Mobile GPU (Vulkan) | Android | ? | ? | ? | ? |
-| CoreML | iOS | | ? | ? | ? | ? |
-| Hexagon DSP | Android | | ? | ? | ? | ? |
-| Raspberry Pi 4/5 | Raspbian | ? | ? | ? | ? |
-| Raspberry Pi 4/5 | Android | ? | ? | ? | ? |
-| ARM 32b (up to v7) | any | | ? | ? | ? | ? |
-
-
-
-
-# CONTRIBUTING to torchchat
-
-We welcome any feature requests, bug reports, or pull requests from
-the community. See the [CONTRIBUTING](CONTRIBUTING.md) for
-instructions how to contribute to torchchat.
-
-
-
# LICENSE

Torchchat is released under the [BSD 3 license](./LICENSE). However
