# Executorch XNNPACK Delegate
This subtree contains the XNNPACK Delegate implementation for Executorch.
XNNPACK is an optimized library of neural network inference operators for ARM
and x86 CPUs. It is an open source project used by PyTorch. The delegate is the
mechanism for leveraging the XNNPACK library to accelerate operators running on
CPU.

## Layout
- `runtime/`: Runtime logic used at inference. This contains all the cpp files
  used to build the runtime graph and execute the XNNPACK model
- `partition/`: The partitioner is used to identify operators in a model's
  graph that are suitable for lowering to the XNNPACK delegate
  - `xnnpack_partitioner.py`: Contains the partitioner that tags graph
    patterns for XNNPACK lowering
  - `configs.py`: Contains lists of ops/modules for XNNPACK lowering
- `passes/`: Contains passes which are used before preprocessing to prepare
  the graph for XNNPACK lowering
- `operators/`: The directory that stores all of the op visitors
  - `node_visitor.py`: Implementation of serializing each lowerable operator
    node
  - ...
- `serialization/`: Contains files related to serializing the XNNPACK graph
  representation of the PyTorch model
  - `schema.fbs`: Flatbuffer schema of the serialization format
  - `xnnpack_graph_schema.py`: Python dataclasses mirroring the flatbuffer
    schema
  - `xnnpack_graph_serialize`: Implementation for serializing dataclasses
    from the graph schema to flatbuffer
- `test/`: Tests for the XNNPACK Delegate
- `xnnpack_preprocess.py`: Contains the preprocess implementation which is
  called by `to_backend` on the graph or subgraph of a model, returning a
  preprocessed blob responsible for executing the graph or subgraph at runtime

## Help & Improvements

If you have problems or questions, or have suggestions for ways to make the
implementation and testing better, please reach out to the PyTorch Edge team
or create an issue on
[GitHub](https://github.com/pytorch/executorch/issues).

## Contributing

Please follow these steps and guidelines when adding a new operator
implementation to this library. The goals of these guidelines are to:
- Make it straightforward to add new XNNPACK operators.
- Ensure that newly added operators are of high quality and easy to maintain.
- Make it easy for users to find the available operator implementations, and
  to trust in their quality and behavioral stability.

### AoT and Serialization Overview

#### Serialization:

The XNNPACK delegate uses flatbuffer to serialize its nodes and values. In
order to add
[preprocessing](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/xnnpack_preprocess.py)
support for a new operator, we must add the operator to both the flatbuffer
[schema](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/serialization/schema.fbs)
and the mirrored python
[data class](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/serialization/xnnpack_graph_schema.py).
These tables are based on the arguments to the XNNPACK Subgraph APIs, which
can be found in
[xnnpack.h](https://github.com/google/xnnpack/blob/master/include/xnnpack.h).
We essentially serialize all the static arguments we need to call
`define_{new operator}()`.
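To make the schema/dataclass pairing concrete, here is a minimal sketch. The table name `XNNSigmoid` and its fields are illustrative, not the actual contents of `schema.fbs`:

```python
from dataclasses import dataclass

# Illustrative only: a flatbuffer table for a hypothetical sigmoid node,
#
#   table XNNSigmoid {
#     input_id: uint;
#     output_id: uint;
#     flags: uint;
#   }
#
# is mirrored by a plain Python dataclass with the same fields, so the
# serializer can populate it from the static arguments of the corresponding
# XNNPACK define_*() call.
@dataclass
class XNNSigmoid:
    input_id: int
    output_id: int
    flags: int = 0
```

The two definitions must stay field-for-field in sync, since the Python side is serialized into the flatbuffer format described by the schema.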

#### AoT Preprocess:

To add logic to preprocess new operators for the XNNPACK Delegate, we can
create new node visitors that perform the serialization of the new operator.
An example can be found in
[`operators/node_visitor.py`](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/operators/node_visitor.py).
The function of these node visitors is to serialize all the data we defined
in the schema above.
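As a rough sketch of the shape a node visitor takes (the class name, target string, and data structures here are hypothetical; real visitors subclass the delegate's node visitor base class):

```python
# Hypothetical sketch of a node visitor: it looks up the ids already assigned
# to the node's input and output values, and appends a serialized record
# carrying the static arguments needed to rebuild the op at runtime.
class SigmoidVisitor:
    target = "aten.sigmoid.default"  # illustrative target name

    def define_node(self, node, serialized_nodes, vals_to_ids):
        serialized_nodes.append({
            "op": "sigmoid",
            "input_id": vals_to_ids[node["input"]],
            "output_id": vals_to_ids[node["output"]],
        })
```

One visitor exists per lowerable operator, keyed by its target, so preprocess can dispatch each tagged node to the visitor that knows how to serialize it.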

#### AoT Partitioner:

The `XnnpackPartitioner` is used to select patterns (like the linear module
graph) in a large graph such that the selected nodes will be delegated to
XNNPACK. To support a new op (for example, sigmoid), add the corresponding op
or module to
[configs.py](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/partition/configs.py)
so that the partitioner will capture and tag it.
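A minimal sketch of what such a config can look like (the list names and op strings are illustrative, not the actual contents of `configs.py`):

```python
# Illustrative config lists: the partitioner consults these to decide which
# ops and modules to tag for XNNPACK lowering.
SUPPORTED_OPS = [
    "aten.add.Tensor",
    "aten.sigmoid.default",  # a newly supported op is appended here
]

SUPPORTED_MODULES = [
    "torch.nn.Linear",
]
```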

#### How does it work?

- Tag the nodes: the XNNPACK partitioner's config lists all ops and modules
  that are supported by the current XNNPACK backend in Executorch. When we
  call `XnnpackPartitioner.partition()`, it tags all the nodes that match the
  entries listed in the config.
- Lower the nodes: when we call `to_backend(graph_module, XnnpackPartitioner)`,
  it loops through all the tagged nodes and lowers each group of nodes that
  share the same tag.
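The two steps above can be sketched with a toy tag-then-lower pass (the names and data structures are illustrative, not the Executorch API):

```python
# Toy sketch: consecutive supported ops share a partition tag; an unsupported
# op ends the current partition. Lowering then groups nodes by tag.
SUPPORTED = {"linear", "sigmoid"}

def tag_nodes(ops):
    tagged, tag = [], 0
    for op in ops:
        if op in SUPPORTED:
            tagged.append((op, f"xnnpack_{tag}"))
        else:
            tagged.append((op, None))  # stays on CPU outside the delegate
            tag += 1
    return tagged

def lower(tagged):
    # Each group of nodes sharing a tag becomes one delegated subgraph.
    groups = {}
    for op, tag in tagged:
        if tag is not None:
            groups.setdefault(tag, []).append(op)
    return groups
```

For example, `["linear", "sigmoid", "softmax", "linear"]` yields two delegated groups split by the unsupported `softmax`.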

#### Adding Tests for newly minted operators

To test newly added operators, we can add unit tests in the
[test](https://github.com/pytorch/executorch/tree/main/backends/xnnpack/test)
directory.
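A hedged sketch of the shape such a unit test can take. The harness here is a pure-Python stand-in; the real tests run the lowered module through the delegate and compare it against the eager PyTorch result:

```python
import math

def eager_sigmoid(xs):
    # Reference implementation standing in for the eager PyTorch op.
    return [1.0 / (1.0 + math.exp(-x)) for x in xs]

def test_sigmoid_matches_eager():
    xs = [-2.0, 0.0, 3.0]
    reference = eager_sigmoid(xs)
    delegated = eager_sigmoid(xs)  # stand-in for running the lowered graph
    assert all(abs(r - o) < 1e-6 for r, o in zip(reference, delegated))
```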