
Cleanup doc wording and code snippets in a few locations #8832

Merged · 1 commit · Feb 28, 2025
4 changes: 4 additions & 0 deletions docs/source/backends-xnnpack.md
@@ -81,6 +81,10 @@ quantizer.set_global(quantization_config)
#### Quantizing a model with the XNNPACKQuantizer
After configuring the quantizer, the model can be quantized via the `prepare_pt2e` and `convert_pt2e` APIs.
```python
from torch.ao.quantization.quantize_pt2e import (
prepare_pt2e,
convert_pt2e,
)
from torch.export import export_for_training

exported_model = export_for_training(model_to_quantize, example_inputs).module()
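The rest of this snippet is elided in the diff view. For orientation, a minimal sketch of how the `prepare_pt2e`/`convert_pt2e` flow typically continues, assuming the `quantizer` configured above and reusing `example_inputs` as stand-in calibration data:

```python
prepared_model = prepare_pt2e(exported_model, quantizer)

# Calibrate: run representative inputs through the prepared model so the
# inserted observers can record activation ranges.
prepared_model(*example_inputs)

quantized_model = convert_pt2e(prepared_model)
```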
57 changes: 33 additions & 24 deletions docs/source/getting-started.md
@@ -5,26 +5,25 @@ This section is intended to describe the necessary steps to take a PyTorch model and:
- Run the model using the ExecuTorch runtime APIs on your development platform.
- Deploy the model to the target platform using the ExecuTorch runtime.

## System Requirements
The following are required to install the ExecuTorch host libraries, needed to export models and run from Python. Requirements for target end-user devices are backend dependent. See the appropriate backend documentation for more information.

- Python 3.10 - 3.12
- g++ version 7 or higher, clang++ version 5 or higher, or another C++17-compatible toolchain.
- Linux or MacOS operating system (Arm or x86).
- Windows is supported via WSL.

## Installation
To use ExecuTorch, you will need to install both the Python package and the appropriate platform-specific runtime libraries.
To use ExecuTorch, you will need to install both the Python package and the appropriate platform-specific runtime libraries. Pip is the recommended way to install the ExecuTorch Python package.

Pip is the recommended way to install the ExecuTorch python package. This package includes the dependencies needed to export a PyTorch model, as well as Python runtime bindings for model testing and evaluation. It is common to install the package within a Python virtual environment, in order to meet the Python and dependency version requirements.
This package includes the dependencies needed to export a PyTorch model, as well as Python runtime bindings for model testing and evaluation. Consider installing ExecuTorch within a virtual environment, such as one provided by [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html#creating-environments) or [venv](https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/#create-and-use-virtual-environments).

```
pip install executorch
```

To build the framework from source, see [Building From Source](using-executorch-building-from-source.md).

Backend delegates may require additional dependencies. See the appropriate backend documentation for more information.
To build the framework from source, see [Building From Source](using-executorch-building-from-source.md). Backend delegates may require additional dependencies. See the appropriate backend documentation for more information.

#### System Requirements
The following are required to install the ExecuTorch host libraries, needed to export models and run from Python. Requirements for target end-user devices are backend dependent. See the appropriate backend documentation for more information.

- Python 3.10 - 3.12
- g++ version 7 or higher, clang++ version 5 or higher, or another C++17-compatible toolchain.
- Linux or MacOS operating system (Arm or x86).
- Windows is supported via WSL.

<hr/>

@@ -44,15 +43,20 @@ ExecuTorch provides hardware acceleration for a wide variety of hardware. The mo…
For mobile use cases, consider using XNNPACK for Android and Core ML or XNNPACK for iOS as a first step. See [Hardware Backends](backends-overview.md) for more information.

### Exporting
Exporting is done using Python APIs. ExecuTorch provides a high degree of customization during the export process, but the typical flow is as follows:
Exporting is done using Python APIs. ExecuTorch provides a high degree of customization during the export process, but the typical flow is as follows. This example uses the MobileNet V2 image classification model implementation in torchvision, but the process supports any [export-compliant](https://pytorch.org/docs/stable/export.html) PyTorch model.

```python
import executorch
import torch
import torchvision.models as models
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge_transform_and_lower

model = MyModel() # The PyTorch model to export
example_inputs = (torch.randn(1,3,64,64),) # A tuple of inputs
model = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224), )
exported_graph = torch.export.export(model, sample_inputs)

> **Contributor:** where is the actual export call?
>
> **Member Author:** It's actually inline in the `to_edge_transform_and_lower` call. I've been doing that for conciseness and have seen users doing this, but maybe it's better to break out?
>
> **Contributor:** Ah, didn't see it somehow; maybe it's easy to miss. Yeah, at least for examples and getting started, breaking it out could be useful.

et_program = executorch.exir.to_edge_transform_and_lower(
torch.export.export(model, example_inputs),
et_program = to_edge_transform_and_lower(
torch.export.export(model, sample_inputs),
partitioner=[XnnpackPartitioner()]
).to_executorch()

@@ -62,24 +66,27 @@ with open("model.pte", "wb") as f:

If the model requires varying input sizes, you will need to specify the varying dimensions and bounds as part of the `export` call. See [Model Export and Lowering](using-executorch-export.md) for more information.
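As a brief sketch, marking the batch dimension of this example model as dynamic might look like the following (the bounds and the `x` argument name are illustrative; the name must match the model's `forward` signature):

```python
from torch.export import Dim, export

# Allow batch sizes from 1 to 8; all other dimensions remain static.
batch = Dim("batch", min=1, max=8)
exported = export(model, sample_inputs, dynamic_shapes={"x": {0: batch}})
```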

The hardware backend to target is controlled by the partitioner parameter to to\_edge\_transform\_and\_lower. In this example, the XnnpackPartitioner is used to target mobile CPUs. See the delegate-specific documentation for a full description of the partitioner and available options.
The hardware backend to target is controlled by the partitioner parameter to to\_edge\_transform\_and\_lower. In this example, the XnnpackPartitioner is used to target mobile CPUs. See the [backend-specific documentation](backends-overview.md) for information on how to use each backend.

Quantization can also be done at this stage to reduce model size and improve runtime performance. Quantization is backend-specific. See the documentation for the target backend for a full description of supported quantization schemes.

### Testing the Model

After successfully generating a .pte file, it is common to use the Python runtime APIs to validate the model on the development platform. This can be used to evaluate model accuracy before running on-device.

Inference can be run as follows:
For the MobileNet V2 model from torchvision used in this example, image inputs are expected as a normalized float32 tensor with dimensions of (batch, channels, height, width). See [torchvision.models.mobilenet_v2](https://pytorch.org/vision/main/models/generated/torchvision.models.mobilenet_v2.html) for more information on the input and output tensor formats for this model.
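For illustration, a typical preprocessing pipeline for this model might look like the sketch below, using the standard ImageNet normalization constants (`cat.jpg` is a placeholder image path):

```python
import torch
from PIL import Image
from torchvision import transforms

# Standard ImageNet preprocessing for the torchvision MobileNet V2 weights.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),  # float32 in [0, 1], shape (C, H, W)
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

input_tensor = preprocess(Image.open("cat.jpg")).unsqueeze(0)  # (1, 3, 224, 224)
```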

```python
import torch
from executorch.runtime import Runtime
from typing import List

runtime = Runtime.get()

input_tensor = torch.randn(1,3,128,128)
program = runtime.load_program("/path/to/mode.pte")
input_tensor: torch.Tensor = torch.randn(1, 3, 224, 224)
program = runtime.load_program("model.pte")
method = program.load_method("forward")
outputs = method.execute([input_tensor])
outputs: List[torch.Tensor] = method.execute([input_tensor])
```
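Continuing the snippet above, the classifier output can be reduced to a predicted ImageNet class index as a quick sanity check:

```python
# outputs[0] holds the (1, 1000) logits tensor produced by the model.
predicted_class = torch.argmax(outputs[0], dim=-1).item()
print(f"Predicted class index: {predicted_class}")
```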


@@ -101,13 +108,15 @@ To add the library to your app, download the AAR, and add it to the gradle build

```
mkdir -p app/libs
curl https://ossci-android.s3.amazonaws.com/executorch/release/executorch-241002/executorch.aar -o app/libs/executorch.aar
curl https://ossci-android.s3.amazonaws.com/executorch/release/v0.5.0-rc3/executorch.aar -o app/libs/executorch.aar
```
And in gradle,
```
// app/build.gradle.kts
dependencies {
implementation(files("libs/executorch.aar"))
implementation("com.facebook.soloader:soloader:0.10.5")
implementation("com.facebook.fbjni:fbjni:0.5.1")
}
```

8 changes: 5 additions & 3 deletions docs/source/using-executorch-export.md
@@ -59,7 +59,7 @@ class Model(torch.nn.Module):
torch.nn.Conv2d(8, 16, 3),
torch.nn.ReLU(),
torch.nn.AdaptiveAvgPool2d((1,1))
)
)
self.linear = torch.nn.Linear(16, 10)

def forward(self, x):
Expand All @@ -68,12 +68,14 @@ class Model(torch.nn.Module):
y = self.linear(y)
return y

model = Model()
model = Model().eval()
inputs = (torch.randn(1,1,16,16),)
outputs = model(*inputs)
print(f"Model output: {outputs}")
```

Note that the model is set to evaluation mode using `.eval()`. Models should always be exported in evaluation mode unless performing on-device training. Evaluation mode configures operations that behave differently during training, such as batch norm and dropout, to use their inference-time behavior.
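A small illustration of the difference, using dropout:

```python
import torch

dropout = torch.nn.Dropout(p=0.5)
x = torch.ones(4)

dropout.train()    # Training mode: elements are randomly zeroed, the rest rescaled.
print(dropout(x))  # e.g. tensor([2., 0., 2., 0.])

dropout.eval()     # Evaluation mode: dropout is a no-op.
print(dropout(x))  # tensor([1., 1., 1., 1.])
```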

## Export and Lowering

To actually export and lower the model, call `export`, `to_edge_transform_and_lower`, and `to_executorch` in sequence. This yields an ExecuTorch program which can be serialized to a file. Putting it all together, lowering the example model above using the XNNPACK delegate for mobile CPU performance can be done as follows:
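Condensed, that sequence looks roughly like this (a sketch mirroring the getting-started example; the XNNPACK partitioner choice is illustrative):

```python
import torch
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge_transform_and_lower

exported = torch.export.export(model, inputs)
et_program = to_edge_transform_and_lower(
    exported,
    partitioner=[XnnpackPartitioner()],
).to_executorch()

# Serialize the resulting program to a .pte file.
with open("model.pte", "wb") as f:
    f.write(et_program.buffer)
```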
@@ -92,7 +94,7 @@ class Model(torch.nn.Module):
torch.nn.ReLU(),
torch.nn.Conv2d(8, 16, 3),
torch.nn.ReLU(),
torch.nn.AdaptiveAvgPool2d([1,1])
torch.nn.AdaptiveAvgPool2d((1,1))
)
self.linear = torch.nn.Linear(16, 10)
