
Commit e20b4e2

GregoryComer authored and pytorchbot committed
Update XNNPACK docs to use to_edge_transform_and_lower API (#5344)
Summary: Quick doc update to use the new to_edge_transform_and_lower API, since we recommend this path now.
Pull Request resolved: #5344
Test Plan: Rendered doc for this PR: https://docs-preview.pytorch.org/pytorch/executorch/5344/tutorial-xnnpack-delegate-lowering.html
Reviewed By: mcr229
Differential Revision: D62634494
Pulled By: GregoryComer
fbshipit-source-id: c28881a8be5b6398da6d506819c28d085ff2762e
(cherry picked from commit 4357230)
1 parent e0e19ed commit e20b4e2

File tree: 1 file changed (+21 lines, −16 lines)

docs/source/tutorial-xnnpack-delegate-lowering.md

Lines changed: 21 additions & 16 deletions
````diff
@@ -25,17 +25,18 @@ import torchvision.models as models
 from torch.export import export, ExportedProgram
 from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
 from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
-from executorch.exir import EdgeProgramManager, ExecutorchProgramManager, to_edge
+from executorch.exir import EdgeProgramManager, ExecutorchProgramManager, to_edge_transform_and_lower
 from executorch.exir.backend.backend_api import to_backend

 mobilenet_v2 = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
 sample_inputs = (torch.randn(1, 3, 224, 224), )

 exported_program: ExportedProgram = export(mobilenet_v2, sample_inputs)
-edge: EdgeProgramManager = to_edge(exported_program)
-
-edge = edge.to_backend(XnnpackPartitioner())
+edge: EdgeProgramManager = to_edge_transform_and_lower(
+    exported_program,
+    partitioner=[XnnpackPartitioner()],
+)
 ```

 We will go through this example with the [MobileNetV2](https://pytorch.org/hub/pytorch_vision_mobilenet_v2/) pretrained model downloaded from the TorchVision library. The flow of lowering a model starts after exporting the model `to_edge`. We call the `to_backend` api with the `XnnpackPartitioner`. The partitioner identifies the subgraphs suitable for XNNPACK backend delegate to consume. Afterwards, the identified subgraphs will be serialized with the XNNPACK Delegate flatbuffer schema and each subgraph will be replaced with a call to the XNNPACK Delegate.
````
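For illustration only (a minimal sketch, not part of this commit): one way to see how many subgraphs the `XnnpackPartitioner` turned into delegate calls is to walk the FX graph of the lowered program. This assumes the `edge` manager from the updated snippet above is in scope and that ExecuTorch has registered the `executorch_call_delegate` higher-order op.

```python
import torch

# Count the call sites the partitioner replaced with XNNPACK delegate calls.
graph_module = edge.exported_program().graph_module
delegate_calls = [
    node
    for node in graph_module.graph.nodes
    if node.op == "call_function"
    and node.target is torch.ops.higher_order.executorch_call_delegate
]
print(f"XNNPACK delegate call sites: {len(delegate_calls)}")
```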
````diff
@@ -47,16 +48,18 @@ GraphModule(
   (lowered_module_1): LoweredBackendModule()
 )

-def forward(self, arg314_1):
+
+
+def forward(self, b_features_0_1_num_batches_tracked, ..., x):
     lowered_module_0 = self.lowered_module_0
-    executorch_call_delegate = torch.ops.higher_order.executorch_call_delegate(lowered_module_0, arg314_1); lowered_module_0 = arg314_1 = None
-    getitem = executorch_call_delegate[0]; executorch_call_delegate = None
-    aten_view_copy_default = executorch_exir_dialects_edge__ops_aten_view_copy_default(getitem, [1, 1280]); getitem = None
-    aten_clone_default = executorch_exir_dialects_edge__ops_aten_clone_default(aten_view_copy_default); aten_view_copy_default = None
     lowered_module_1 = self.lowered_module_1
-    executorch_call_delegate_1 = torch.ops.higher_order.executorch_call_delegate(lowered_module_1, aten_clone_default); lowered_module_1 = aten_clone_default = None
-    getitem_1 = executorch_call_delegate_1[0]; executorch_call_delegate_1 = None
-    return (getitem_1,)
+    executorch_call_delegate_1 = torch.ops.higher_order.executorch_call_delegate(lowered_module_1, x); lowered_module_1 = x = None
+    getitem_53 = executorch_call_delegate_1[0]; executorch_call_delegate_1 = None
+    aten_view_copy_default = executorch_exir_dialects_edge__ops_aten_view_copy_default(getitem_53, [1, 1280]); getitem_53 = None
+    aten_clone_default = executorch_exir_dialects_edge__ops_aten_clone_default(aten_view_copy_default); aten_view_copy_default = None
+    executorch_call_delegate = torch.ops.higher_order.executorch_call_delegate(lowered_module_0, aten_clone_default); lowered_module_0 = aten_clone_default = None
+    getitem_52 = executorch_call_delegate[0]; executorch_call_delegate = None
+    return (getitem_52,)
 ```

 We print the graph after lowering above to show the new nodes that were inserted to call the XNNPACK Delegate. The subgraphs which are being delegated to XNNPACK are the first argument at each call site. It can be observed that the majority of `convolution-relu-add` blocks and `linear` blocks were able to be delegated to XNNPACK. We can also see the operators which were not able to be lowered to the XNNPACK delegate, such as `clone` and `view_copy`.
````
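For illustration only (a minimal sketch, assuming the same `edge` manager as above): a printout like the one in this hunk can typically be obtained by printing the lowered graph module directly.

```python
# Print the lowered edge-dialect graph, including the executorch_call_delegate nodes.
print(edge.exported_program().graph_module)
```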
````diff
@@ -75,7 +78,7 @@ The XNNPACK delegate can also execute symmetrically quantized models. To underst

 ```python
 from torch.export import export_for_training
-from executorch.exir import EdgeCompileConfig
+from executorch.exir import EdgeCompileConfig, to_edge_transform_and_lower

 mobilenet_v2 = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
 sample_inputs = (torch.randn(1, 3, 224, 224), )
````
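For illustration only: the "two stage export" referenced in the next hunk's context is not shown in this diff. A minimal sketch of that stage, assuming the PT2E quantization APIs from `torch.ao` (`XNNPACKQuantizer`, `prepare_pt2e`, `convert_pt2e`) and the `mobilenet_v2` / `sample_inputs` definitions above, could look like the following.

```python
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

# Stage 1: capture a training-friendly graph to quantize.
training_gm = export_for_training(mobilenet_v2, sample_inputs).module()

# Annotate with the XNNPACK quantizer, calibrate, then convert.
quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())
prepared_gm = prepare_pt2e(training_gm, quantizer)
prepared_gm(*sample_inputs)  # calibration pass with representative inputs
quantized_mobilenetv2 = convert_pt2e(prepared_gm)
```

The quantized module is then re-exported with `export` and lowered with `to_edge_transform_and_lower`, as shown in the next hunk.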
````diff
@@ -111,9 +114,11 @@ Quantization requires a two stage export. First we use the `export_for_training`

 ```python
 # Continued from earlier...
-edge = to_edge(export(quantized_mobilenetv2, sample_inputs), compile_config=EdgeCompileConfig(_check_ir_validity=False))
-
-edge = edge.to_backend(XnnpackPartitioner())
+edge = to_edge_transform_and_lower(
+    export(quantized_mobilenetv2, sample_inputs),
+    compile_config=EdgeCompileConfig(_check_ir_validity=False),
+    partitioner=[XnnpackPartitioner()]
+)

 exec_prog = edge.to_executorch()

````