
Commit a40e245

angelayi authored and facebook-github-bot committed
EXIR
Differential Revision: D49602668
1 parent fd30e3a commit a40e245

6 files changed: +868 -2 lines changed

docs/source/conf.py

Lines changed: 1 addition & 0 deletions
@@ -67,6 +67,7 @@
 myst_enable_extensions = [
     "colon_fence",
 ]
+myst_all_links_external=True
 
 sphinx_gallery_conf = {
     "examples_dirs": ["tutorials_source"],

docs/source/index.rst

Lines changed: 3 additions & 0 deletions
@@ -101,6 +101,9 @@ Topics in this section will help you get started with ExecuTorch.
    :hidden:
 
    ir-exir
+   ir-exir-aten-dialect
+   ir-exir-edge-dialect
+   ir-exir-backend-dialect
    ir-ops-set-definition
    ir-high-order-operators

docs/source/ir-exir-aten-dialect.md

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
# ATen Dialect

## Properties

An ATen dialect graph is a valid EXIR graph with the following additional properties:

1. All operators in OpCall nodes are either from a predefined operator set,
called ["Core ATen Operators"](https://pytorch.org/docs/stable/ir.html), or a
registered custom operator. A registered custom operator is an operator
registered into the current PyTorch eager mode runtime, usually with a
TORCH_LIBRARY call (which implies a schema); a minimal registration sketch follows this list.
2. Every ATen operator must also have a meta kernel. A meta kernel is a
function that, given the shapes of the input tensors, returns the shapes of the
output tensors.
3. Input value types must be "Pytree-able" [See 2]. As a consequence, the output
types are also Pytree-able, because all operator outputs are Pytree-able.
4. Ops of ATen dialect can choose to work with dynamic dtypes, implicit type
promotion, and implicit broadcasting of tensors.
5. All tensor memory formats are in [**PyTorch Default Dims Format**](./ir-exir.md#memory-formats),
i.e. torch.contiguous_format.
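The sketch below is a hedged illustration (not part of this commit) of what properties 1 and 2 mean in practice: a custom operator registered with a schema via `torch.library` (the Python counterpart of a `TORCH_LIBRARY` call) plus a meta kernel that only computes output shapes. The namespace `my_ops` and the operator `custom_linear_relu` are hypothetical names used purely for illustration.

```python
import torch
from torch.library import Library, impl

# Hypothetical namespace; TORCH_LIBRARY(my_ops, m) would be the C++ equivalent.
my_lib = Library("my_ops", "DEF")
my_lib.define("custom_linear_relu(Tensor x, Tensor weight) -> Tensor")

@impl(my_lib, "custom_linear_relu", "CompositeExplicitAutograd")
def custom_linear_relu(x, weight):
    # Eager-mode implementation used by the PyTorch runtime.
    return torch.relu(x @ weight.t())

@impl(my_lib, "custom_linear_relu", "Meta")
def custom_linear_relu_meta(x, weight):
    # Meta kernel: derives the output shape from the input shapes without real data.
    return torch.empty(x.shape[0], weight.shape[0], device="meta", dtype=x.dtype)
```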
| Op Set | Canonical ATen | Custom Op | <del>All ATen Ops</del> |
| ------ | -------------- | --------- | ----------------------- |
| ATen   | Allowed        | Allowed, must have meta kernel | |
| Edge   | ATen + type specializations | Allowed | |

## Intent

This section describes what we envision ATen dialect to be used for.

ATen dialect will be used as the entry point of the ExecuTorch compilation
pipeline; it is the first time an eager mode PyTorch program becomes an EXIR
graph. At this stage, functionalization is performed, so all tensor aliases
are replaced with copies. As a result, all tensors are converted to contiguous format.

The goal of this dialect is to capture users' programs as faithfully as possible
(while remaining valid EXIR). Registered custom operators that the user has called
in eager mode will be preserved as-is in ATen dialect. However, we should refrain
from adding custom ops to the graph via passes.

For now, the function of ATen dialect is to be further lowered to edge dialect.
However, in the future we can see it as the common integration point for
other export use cases.
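As a rough sketch of this entry point, and assuming the `exir.capture` API that the edge dialect page in this commit uses, an eager `torch.nn.Module` enters ATen dialect roughly like this (the module and inputs are placeholders):

```python
import torch
from executorch import exir

class ToyModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x + 1.0)

# exir.capture traces the eager program into an EXIR graph in ATen dialect;
# calling .to_edge() on the result would lower it further to edge dialect.
aten_dialect_program = exir.capture(ToyModel(), (torch.rand(2, 2),))
```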
## ATen Operator Definition
[under construction]
docs/source/ir-exir-backend-dialect.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
# Backend Dialect

## Properties

Backend dialect is the name we give to the `ExportedProgram` in Edge dialect after optional **target-specific** passes have been applied. The difference between backend dialect and edge dialect is that backend dialect is target-aware and may contain operators or submodules that are only meaningful to the target backend. Compared with Edge dialect, the new components we may see in a backend dialect are backend-specific operators: a set of operators defined for the target backend.

Another property to note is that tensor memory formats can be any format (this is subject to change in the near future, when we introduce dim order to backend dialect).

## Intent

This dialect allows the introduction of operators that do not conform to the schemas defined in the canonical ATen operator set and do not show up in any of the dialects above (ATen dialect and edge dialect). Consider using backend operators if your use case satisfies one or more of the following criteria:

1. Your backend provides a library that optimizes a certain operator that is equivalent to a subgraph. E.g., linear_relu (equivalent to linear + relu) that can be executed faster on a certain backend.
2. There's a need to retrace the graph module after it has already been lowered to a backend. When we retrace, backend operators can transform back to the original subgraph (in ATen dialect), which a normal custom op cannot do.
3. Your backend-specific operator doesn't have a generic CPU kernel but only a kernel for a certain backend. Using a backend operator works around this issue by using the original subgraph as the default kernel, keeping the graph module runnable.
## How to use

To lower edge ops to backend ops, a pass performs pattern matching to identify the edge ops of interest in the graph and then replaces them with equivalent backend operators. There are two APIs to register such passes:

* `transform()`. An API on `ExportProgram` that allows users to provide custom passes. Note that this is not guarded by any validator, so the soundness of the program is not guaranteed.
* [`ExecutorchBackendConfig.passes`](https://github.com/pytorch/executorch/blob/main/exir/capture/_config.py#L40). If added here, the pass will be part of the lowering process from backend dialect to `ExecutorchProgram` (a sketch of this wiring follows the example below).

Example: one such pass is `QuantFusion`. This pass takes a "canonical quantization pattern", i.e. "dequant - some_op - quant", and fuses it into a single operator that is backend specific, i.e. `quantized_decomposed::some_op`. You can find more details [here](./quantization-custom-quantization.md). Another, simpler example is [here](https://github.com/pytorch/executorch/blob/main/exir/passes/replace_edge_with_backend_pass.py#L20), where we replace sym_size operators with the ones that are understood by ExecuTorch.
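As an illustration of the second registration point, here is a hedged sketch (not from this commit) of attaching a pass through `ExecutorchBackendConfig.passes`; the pass class, the import paths, and the `to_executorch()` call are assumptions based on the repository layout linked above:

```python
# Sketch only: a placeholder pass wired into ExecutorchBackendConfig.passes.
# Import paths and the to_executorch() call are assumptions, not taken from this commit.
from executorch.exir import ExecutorchBackendConfig
from executorch.exir.pass_base import ExportPass


class FuseLinearReluPass(ExportPass):
    """Hypothetical target-specific pass; a real one would pattern-match
    "linear + relu" and replace it with a backend linear_relu operator."""

    def call(self, graph_module):
        # Pattern matching and replacement would happen here.
        return super().call(graph_module)


config = ExecutorchBackendConfig(passes=[FuseLinearReluPass()])
# The pass then runs as part of lowering to ExecutorchProgram, e.g.:
# executorch_program = edge_program.to_executorch(config)
```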
## API

We provide a decorator `bind_pattern_to_op` to help users easily register their backend operators into EXIR. This decorator takes:
* a `torch.Library` object, which indicates which library or namespace this backend operator belongs to.
* a name or schema. If we have already defined the schema of the backend operator in the `torch.Library` object, only a name is needed. Otherwise, we can register the schema if a schema string is passed in.

This decorator should be added to the pattern we are trying to match (and then lower to this backend op) on edge dialect. This way we register the pattern as a `CompositeImplicitAutograd` kernel for the backend operator.

Then the operator can be accessed/used from the passes; a usage sketch follows the list below. The `CompositeImplicitAutograd` kernel makes sure:
1. No need for the user to write a (CPU) runnable kernel
2. Ensures the retraceability of `ExportProgram`. Once retraced, the backend operator will be decomposed into the ATen ops used in the pattern.
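Below is a hedged sketch of how the decorator might be used, continuing the hypothetical linear_relu example from the criteria above. The library name, the schema, and the import path are assumptions, not taken from this commit:

```python
# Sketch only: registering a backend op whose CompositeImplicitAutograd kernel
# is the edge-dialect pattern it replaces. Names and import path are assumptions.
import torch
from torch.library import Library
from executorch.exir.dialects.backend._ops import bind_pattern_to_op

backend_lib = Library("my_backend", "DEF")  # hypothetical backend namespace

@bind_pattern_to_op(backend_lib, "linear_relu(Tensor x, Tensor weight, Tensor bias) -> Tensor")
def linear_relu(x, weight, bias):
    # The pattern to match on edge dialect; it also serves as the decomposition
    # used when the program is retraced.
    return torch.relu(torch.nn.functional.linear(x, weight, bias))
```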
## Op Set

Unlike edge dialect, which has a well-defined op set, backend dialect is target-aware, so we allow users to use our API to register target-aware ops, grouped by namespace. Here are some examples: `executorch_prims` are ops that are used by the ExecuTorch runtime to perform operations on `SymInt`s; `quantized_decomposed` are ops that fuse edge operators for quantization purposes and are meaningful to targets that support quantization. A short sketch of where the `executorch_prims` ops come from follows the list.

* `executorch_prims::add.int(SymInt a, SymInt b) -> SymInt`
  * pattern: builtin.add
  * backend: executor
* `executorch_prims::mul.int(SymInt a, SymInt b) -> SymInt`
  * pattern: builtin.mul
  * backend: executor
* `executorch_prims::sub.int(SymInt a, SymInt b) -> SymInt`
  * pattern: builtin.sub
  * backend: executor
* `executorch_prims::floordiv.int(SymInt a, SymInt b) -> SymInt`
  * pattern: builtin.floordiv
  * backend: executor
* `executorch_prims::gt.int(SymInt a, SymInt b) -> bool`
  * pattern: builtin.gt
  * backend: executor
* `executorch_prims::lt.int(SymInt a, SymInt b) -> bool`
  * pattern: builtin.lt
  * backend: executor
* `executorch_prims::ge.int(SymInt a, SymInt b) -> bool`
  * pattern: builtin.ge
  * backend: executor
* `executorch_prims::le.int(SymInt a, SymInt b) -> bool`
  * pattern: builtin.le
  * backend: executor
* `executorch_prims::eq.int(SymInt a, SymInt b) -> bool`
  * pattern: builtin.eq
  * backend: executor
* `quantized_decomposed::embedding_byte(Tensor weight, Tensor weight_scales, Tensor weight_zero_points, int weight_quant_min, int weight_quant_max, Tensor indices) -> Tensor`
  * pattern: [source](https://github.com/pytorch/executorch/blob/main/exir/passes/_quant_patterns_and_replacements.py)
  * backend: quantization
* `quantized_decomposed::add(Tensor a, float a_scale, int a_zero_point, int a_quant_min, int a_quant_max, Tensor b, float b_scale, int b_zero_point, int b_quant_min, int b_quant_max, float out_scale, int out_zero_point, int out_quant_min, int out_quant_max) -> Tensor qc`
  * pattern: [source](https://github.com/pytorch/executorch/blob/main/exir/passes/_quant_patterns_and_replacements.py)
  * backend: quantization
* `quantized_decomposed::add.scalar(Tensor qa, float a_scale, int a_zero_point, int a_quant_min, int a_quant_max, ScalarType a_dtype, Scalar b, float out_scale, int out_zero_point, int out_quant_min, int out_quant_max, ScalarType out_dtype) -> Tensor`
  * pattern: [source](https://github.com/pytorch/executorch/blob/main/exir/passes/_quant_patterns_and_replacements.py)
  * backend: quantization
* `quantized_decomposed::add_relu(Tensor a, float a_scale, int a_zero_point, int a_quant_min, int a_quant_max, Tensor b, float b_scale, int b_zero_point, int b_quant_min, int b_quant_max, float out_scale, int out_zero_point, int out_quant_min, int out_quant_max) -> Tensor qc`
  * pattern: [source](https://github.com/pytorch/executorch/blob/main/exir/passes/_quant_patterns_and_replacements.py)
  * backend: quantization
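To give a feel for where the `executorch_prims` entries come from, here is a hedged sketch of eager code whose exported graph contains the kind of `sym_size` and builtin SymInt arithmetic nodes that a pass like the replace_edge_with_backend_pass linked above rewrites into these backend ops; the module is a made-up example:

```python
# Sketch only: a module whose forward does Python int arithmetic on a dimension.
# Under export with dynamic shapes, x.shape[0] becomes a SymInt, and the "+ 1"
# shows up as a builtin add node, matching the builtin.add pattern listed above.
import torch

class PadByOne(torch.nn.Module):
    def forward(self, x):
        pad_len = x.shape[0] + 1          # SymInt arithmetic under dynamic shapes
        return torch.zeros(pad_len, x.shape[1])
```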

docs/source/ir-exir-edge-dialect.md

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
# Edge dialect

Edge dialect is a dialect of EXIR satisfying the following properties:

## Properties

1. All operators in OpCall nodes are either from a predefined operator set,
called **"Edge Operators"**, or a registered custom operator. An Edge operator is an
ATen operator with dtype specialization.
2. Input and output of the graph, as well as of every node, cannot be Scalar, i.e.
all scalar types (such as float, int) are converted to Tensor.

## Intent

This dialect is meant to introduce specializations that are useful for Edge
devices but not necessarily for general (server) export.
However, we still refrain from specializing further for each specific hardware target.
In other words, we don't want to introduce any new hardware-dependent concepts or data
beyond those already present in the user's original Python program.

## How to use

A GraphModule in EXIR edge dialect is represented in memory with the `torch.fx.GraphModule` Python class. To obtain such a class, one starts with a `torch.nn.Module`:

```python
import torch
from executorch import exir

class MyModule(torch.nn.Module):
    def forward(self, x):
        return torch.sin(x)  # placeholder body; any traceable forward works here

a = MyModule()
tracing_inputs = (torch.rand(2, 2),)
edge_dialect_module = exir.capture(a, tracing_inputs).to_edge().module
```

As we can see, if no argument is provided to the `to_edge()` API, the lowering process from ATen dialect to edge dialect should be invisible to the user. However, we provide some knobs for advanced usage (a combined sketch follows the list):

* `EdgeCompileConfig.passes`
  User-defined graph transformations go in here. Order matters. Note: if a custom pass touches `node.target`, be aware that all `node.target`s at this stage are "Edge ops" (more details below) and not torch ops as in ATen dialect. A tutorial on pass writing can be found [here](./compiler-custom-compiler-passes.md). After all these passes are executed, `to_edge()` will make sure the graph is still valid.

* `EdgeCompileConfig._check_ir_validity`
  Default value is true. If set to false, the graph validity check will be turned off. Turn this flag off with caution, since the graph may become invalid after `to_edge()`.
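Putting the two knobs together, a hedged sketch that assumes the `exir.capture` flow shown above; `RemoveNoopPass` is a hypothetical placeholder and the `EdgeCompileConfig` import path is an assumption:

```python
# Sketch only: passing an EdgeCompileConfig with a placeholder pass to to_edge().
# Import paths and the pass class are assumptions, not part of this commit.
import torch
from executorch import exir
from executorch.exir import EdgeCompileConfig
from executorch.exir.pass_base import ExportPass


class RemoveNoopPass(ExportPass):
    """Hypothetical user-defined pass; node.target values seen here are Edge ops."""
    pass


class MyModule(torch.nn.Module):
    def forward(self, x):
        return torch.sin(x)


edge_config = EdgeCompileConfig(
    passes=[RemoveNoopPass()],   # user-defined passes, executed in order
    _check_ir_validity=True,     # keep the edge dialect validator enabled
)
edge_module = exir.capture(MyModule(), (torch.rand(2, 2),)).to_edge(edge_config).module
```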
## Edge Operator
As mentioned before, an edge operator is an ATen core operator with type specialization. This means an instance of an edge operator contains a set of dtype constraints that describe all the tensor dtypes supported by both the ExecuTorch runtime and their ATen kernels. These dtype constraints are expressed in a DSL defined in [edge.yaml](https://github.com/pytorch/executorch/blob/main/exir/dialects/edge/edge.yaml). Here's an example of the dtype constraints:
```yaml
- func: sigmoid
  namespace: edge
  inherits: aten::sigmoid
  type_alias:
    T0: [Bool, Byte, Char, Int, Long, Short]
    T1: [Double, Float]
    T2: [Float]
  type_constraint:
  - self: T0
    __ret_0: T2
  - self: T1
    __ret_0: T1
```
This says that if the `self` tensor is one of the types `Bool, Byte, Char, Int, Long, Short`, the returned tensor will be `Float`. If `self` is one of `Double, Float`, the returned tensor will have the same dtype.

After these dtype constraints are collected and documented in edge.yaml, EXIR consumes them and loads them into EXIR Edge operators. This makes it convenient for developers to learn the supported dtypes of any argument in an Edge op schema. For example, we can do:
```python
from executorch.exir.dialects._ops import ops as exir_ops  # import dialects ops
sigmoid = exir_ops.edge.aten.sigmoid.default
print(sigmoid._schema)
# aten::sigmoid(Tensor self) -> Tensor
self_arg = sigmoid._schema.arguments[0]
_return = sigmoid._schema.returns[0]

print(self_arg.allowed_types)
# {torch.float32, torch.int8, torch.float64, torch.int16, torch.int32, torch.int64, torch.uint8, torch.bool}

print(_return.allowed_types)
# {torch.float32, torch.float64}
```
These constraints are helpful for someone who wants to write a custom kernel for this operator. Also, inside EXIR, we offer a validator to check whether the graph still complies with these dtype constraints after custom transformations.

## Op Set (WIP)
Check out [edge.yaml](https://github.com/pytorch/executorch/blob/main/exir/dialects/edge/edge.yaml) for the complete list of operators having dtype constraints specified. We are gradually expanding this operator set and aim to provide dtype constraints for all core ATen ops.
