Commit bd87ca5: CoreML doc updates (#9463)

Addressing #8527

1 file changed: docs/source/backends-coreml.md (+157, -121 lines)

# Core ML Backend

The Core ML delegate is the ExecuTorch solution for taking advantage of Apple's [Core ML framework](https://developer.apple.com/documentation/coreml) for on-device ML. With Core ML, a model can run on the CPU, GPU, and Apple Neural Engine (ANE).

## Features

- Dynamic dispatch to the CPU, GPU, and ANE.
- Supports fp32 and fp16 computation.

## Target Requirements

Below are the minimum OS requirements on various hardware for running a CoreML-delegated ExecuTorch model:

- [macOS](https://developer.apple.com/macos) >= 13.0
- [iOS](https://developer.apple.com/ios/) >= 16.0
- [iPadOS](https://developer.apple.com/ipados/) >= 16.0
- [tvOS](https://developer.apple.com/tvos/) >= 16.0

## Development Requirements

To develop, you need:

- [macOS](https://developer.apple.com/macos) >= 13.0
- [Xcode](https://developer.apple.com/documentation/xcode) >= 14.1

Before starting, make sure you install the Xcode Command Line Tools:

```bash
xcode-select --install
```

Finally, install the CoreML backend by running the following script:

```bash
sh ./backends/apple/coreml/scripts/install_requirements.sh
```

----

## Using the CoreML Backend

To target the CoreML backend during the export and lowering process, pass an instance of the `CoreMLPartitioner` to `to_edge_transform_and_lower`. The example below demonstrates this process using the MobileNet V2 model from torchvision.

```python
import torch
import torchvision.models as models
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
from executorch.backends.apple.coreml.partition import CoreMLPartitioner
from executorch.exir import to_edge_transform_and_lower

mobilenet_v2 = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224),)

et_program = to_edge_transform_and_lower(
    torch.export.export(mobilenet_v2, sample_inputs),
    partitioner=[CoreMLPartitioner()],
).to_executorch()

with open("mv2_coreml.pte", "wb") as file:
    et_program.write_to_file(file)
```

### Partitioner API

The CoreML partitioner API allows for configuration of the model delegation to CoreML. Passing a `CoreMLPartitioner` instance with no additional parameters will run as much of the model as possible on the CoreML backend with default settings. This is the most common use case. For advanced use cases, the partitioner exposes the following options via the [constructor](https://github.com/pytorch/executorch/blob/14ff52ff89a89c074fc6c14d3f01683677783dcd/backends/apple/coreml/partition/coreml_partitioner.py#L60):

- `skip_ops_for_coreml_delegation`: Allows you to skip ops for delegation by CoreML. By default, all ops that CoreML supports will be delegated. See [here](https://github.com/pytorch/executorch/blob/14ff52ff89a89c074fc6c14d3f01683677783dcd/backends/apple/coreml/test/test_coreml_partitioner.py#L42) for an example of skipping an op for delegation, and the sketch just after this list.
- `compile_specs`: A list of `CompileSpec`s for the CoreML backend. These control low-level details of CoreML delegation, such as the compute unit (CPU, GPU, ANE), the iOS deployment target, and the compute precision (FP16, FP32). These are discussed in more detail below.
- `take_over_mutable_buffer`: A boolean that indicates whether PyTorch mutable buffers in stateful models should be converted to [CoreML MLState](https://developer.apple.com/documentation/coreml/mlstate). If set to false, mutable buffers in the PyTorch graph are converted to graph inputs and outputs of the CoreML-lowered module under the hood. Setting `take_over_mutable_buffer` to true generally results in better performance, but using MLState requires iOS >= 18.0, macOS >= 15.0, and Xcode >= 16.0.
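
For example, here is a minimal sketch of skipping a specific op (the op name below is illustrative; use the op names that actually appear in your exported graph):

```python
from executorch.backends.apple.coreml.partition import CoreMLPartitioner

# Hypothetical example: keep add ops out of the CoreML partition so they
# run on the CPU instead. Inspect your exported program's graph for the
# exact op name strings to pass here.
partitioner = CoreMLPartitioner(
    skip_ops_for_coreml_delegation=["aten.add.Tensor"]
)
```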

#### CoreML CompileSpec

A list of `CompileSpec`s is constructed with [CoreMLBackend.generate_compile_specs](https://github.com/pytorch/executorch/blob/14ff52ff89a89c074fc6c14d3f01683677783dcd/backends/apple/coreml/compiler/coreml_preprocess.py#L210). Below are the available options:

- `compute_unit`: This controls the compute units (CPU, GPU, ANE) that are used by CoreML. The default value is `coremltools.ComputeUnit.ALL`. The available options from coremltools are:
    - `coremltools.ComputeUnit.ALL` (uses the CPU, GPU, and ANE)
    - `coremltools.ComputeUnit.CPU_ONLY` (uses the CPU only)
    - `coremltools.ComputeUnit.CPU_AND_GPU` (uses both the CPU and GPU, but not the ANE)
    - `coremltools.ComputeUnit.CPU_AND_NE` (uses both the CPU and ANE, but not the GPU)
- `minimum_deployment_target`: The minimum iOS deployment target (e.g., `coremltools.target.iOS18`). The default value is `coremltools.target.iOS15`.
- `compute_precision`: The compute precision used by CoreML (`coremltools.precision.FLOAT16` or `coremltools.precision.FLOAT32`). The default value is `coremltools.precision.FLOAT16`. Note that the compute precision is applied regardless of the dtype specified in the exported PyTorch model. For example, an FP32 PyTorch model is converted to FP16 when delegating to the CoreML backend by default. Also note that the ANE only supports FP16 precision.
- `model_type`: Whether the model should be compiled to the CoreML [mlmodelc format](https://developer.apple.com/documentation/coreml/downloading-and-compiling-a-model-on-the-user-s-device) during .pte creation ([CoreMLBackend.MODEL_TYPE.COMPILED_MODEL](https://github.com/pytorch/executorch/blob/14ff52ff89a89c074fc6c14d3f01683677783dcd/backends/apple/coreml/compiler/coreml_preprocess.py#L71)), or whether it should be compiled to mlmodelc on device ([CoreMLBackend.MODEL_TYPE.MODEL](https://github.com/pytorch/executorch/blob/14ff52ff89a89c074fc6c14d3f01683677783dcd/backends/apple/coreml/compiler/coreml_preprocess.py#L70)). Using `CoreMLBackend.MODEL_TYPE.COMPILED_MODEL` and doing compilation ahead of time should improve the first on-device model load time.
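
As a sketch of how these options fit together, the following builds a partitioner from a list of compile specs. The specific choices below are illustrative, not required:

```python
import coremltools as ct
from executorch.backends.apple.coreml.compiler import CoreMLBackend
from executorch.backends.apple.coreml.partition import CoreMLPartitioner

# Illustrative configuration: prefer the CPU and ANE, target iOS 17, and
# compile to mlmodelc ahead of time for a faster first on-device load.
compile_specs = CoreMLBackend.generate_compile_specs(
    compute_unit=ct.ComputeUnit.CPU_AND_NE,
    minimum_deployment_target=ct.target.iOS17,
    compute_precision=ct.precision.FLOAT16,
    model_type=CoreMLBackend.MODEL_TYPE.COMPILED_MODEL,
)
partitioner = CoreMLPartitioner(compile_specs=compile_specs)
```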

### Testing the Model

After generating the CoreML-delegated .pte, the model can be tested from Python using the ExecuTorch runtime Python bindings. This can be used to sanity check the model and evaluate numerical accuracy. See [Testing the Model](using-executorch-export.md#testing-the-model) for more information.
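
For instance, continuing the MobileNet V2 example above, here is a minimal sanity-check sketch. It assumes the Python bindings were built with the CoreML backend enabled (on macOS), and the exact binding API may vary between ExecuTorch versions:

```python
import torch
from executorch.runtime import Runtime

# Load the delegated program through the Python runtime bindings.
runtime = Runtime.get()
program = runtime.load_program("mv2_coreml.pte")
method = program.load_method("forward")

# Run the lowered model and compare against the eager PyTorch model.
# Expect some deviation, since CoreML computes in FP16 by default.
et_output = method.execute([sample_inputs[0]])[0]
eager_output = mobilenet_v2(*sample_inputs)
print("max abs error:", torch.max(torch.abs(et_output - eager_output)).item())
```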

----
### Quantization
90+
91+
To quantize a PyTorch model for the CoreML backend, use the `CoreMLQuantizer`. `Quantizers` are backend specific, and the `CoreMLQuantizer` is configured to quantize models to leverage the available quantization for the CoreML backend.
92+
93+
### 8-bit Quantization using the PT2E Flow
94+
95+
To perform 8-bit quantization with the PT2E flow, perform the following steps:
96+
97+
1) Define [coremltools.optimize.torch.quantization.LinearQuantizerConfig](https://apple.github.io/coremltools/source/coremltools.optimize.torch.quantization.html#coremltools.optimize.torch.quantization.LinearQuantizerConfig) and use to to create an instance of a `CoreMLQuantizer`.
98+
2) Use `torch.export.export_for_training` to export a graph module that will be prepared for quantization.
99+
3) Call `prepare_pt2e` to prepare the model for quantization.
100+
4) For static quantization, run the prepared model with representative samples to calibrate the quantizated tensor activation ranges.
101+
5) Call `convert_pt2e` to quantize the model.
102+
6) Export and lower the model using the standard flow.
103+
104+
The output of `convert_pt2e` is a PyTorch model which can be exported and lowered using the normal flow. As it is a regular PyTorch model, it can also be used to evaluate the accuracy of the quantized model using standard PyTorch techniques.
105+
106+

```python
import torch
import coremltools as ct
import torchvision.models as models
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
from executorch.backends.apple.coreml.quantizer import CoreMLQuantizer
from executorch.backends.apple.coreml.partition import CoreMLPartitioner
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from executorch.exir import to_edge_transform_and_lower
from executorch.backends.apple.coreml.compiler import CoreMLBackend

mobilenet_v2 = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224),)

# Step 1: Define a LinearQuantizerConfig and create an instance of a CoreMLQuantizer
quantization_config = ct.optimize.torch.quantization.LinearQuantizerConfig.from_dict(
    {
        "global_config": {
            "quantization_scheme": ct.optimize.torch.quantization.QuantizationScheme.symmetric,
            "milestones": [0, 0, 10, 10],
            "activation_dtype": torch.quint8,
            "weight_dtype": torch.qint8,
            "weight_per_channel": True,
        }
    }
)
quantizer = CoreMLQuantizer(quantization_config)

# Step 2: Export the model for training
training_gm = torch.export.export_for_training(mobilenet_v2, sample_inputs).module()

# Step 3: Prepare the model for quantization
prepared_model = prepare_pt2e(training_gm, quantizer)

# Step 4: Calibrate the model on representative data
# Replace with your own calibration data
for calibration_sample in [torch.randn(1, 3, 224, 224)]:
    prepared_model(calibration_sample)

# Step 5: Convert the calibrated model to a quantized model
quantized_model = convert_pt2e(prepared_model)

# Step 6: Export the quantized model to CoreML
et_program = to_edge_transform_and_lower(
    torch.export.export(quantized_model, sample_inputs),
    partitioner=[
        CoreMLPartitioner(
            # iOS17 is required for the quantized ops in this example
            compile_specs=CoreMLBackend.generate_compile_specs(
                minimum_deployment_target=ct.target.iOS17
            )
        )
    ],
).to_executorch()
```

See [PyTorch 2 Export Post Training Quantization](https://pytorch.org/tutorials/prototype/pt2e_quant_ptq.html) for more information.

----

## Runtime Integration

To run the model on-device, use the standard ExecuTorch runtime APIs. See [Running on Device](getting-started.md#running-on-device) for more information, including building the iOS frameworks.

When building from source, pass `-DEXECUTORCH_BUILD_COREML=ON` when configuring the CMake build to compile the CoreML backend.

Then link against the `coremldelegate` target. Due to the use of static registration, it may be necessary to link with whole-archive. This can typically be done by passing `"$<LINK_LIBRARY:WHOLE_ARCHIVE,coremldelegate>"` to `target_link_libraries`.

```
# CMakeLists.txt
add_subdirectory("executorch")
...
target_link_libraries(
    my_target
    PRIVATE executorch
    executorch_module_static
    executorch_tensor
    optimized_native_cpu_ops_lib
    coremldelegate)
```

No additional steps are necessary to use the backend beyond linking the target. A CoreML-delegated .pte file will automatically run on the registered backend.

---

## Advanced

### Extracting the mlpackage

[CoreML *.mlpackage files](https://apple.github.io/coremltools/docs-guides/source/convert-to-ml-program.html#save-ml-programs-as-model-packages) can be extracted from a CoreML-delegated *.pte file. This can help with debugging and profiling for users who are more familiar with *.mlpackage files:

```bash
python examples/apple/coreml/scripts/extract_coreml_models.py -m /path/to/model.pte
```

Note that if the ExecuTorch model has graph breaks, there may be multiple extracted *.mlpackage files.
