Commit 9383453

cccclai authored and facebook-github-bot committed

Separate delegate doc into two parts: vendors and users

Summary: Would like to separate the delegate documents into two parts: one for delegate users and one for delegate developers.

Reviewed By: mergennachin

Differential Revision: D49619659

fbshipit-source-id: 0e21c64e9924aab9136832d094a3b2749f2b9bc1

1 parent 5714285 commit 9383453
File tree

2 files changed: +195 −191 lines changed


docs/source/compiler-delegate-and-partitioner.md

Lines changed: 9 additions & 191 deletions
@@ -1,5 +1,7 @@
# Backend and Delegate

Audience: Vendors and backend delegate developers who are interested in integrating their own compilers and hardware into ExecuTorch.

Backend delegation is an entry point for backends to process and execute PyTorch
programs to leverage performance and efficiency benefits of specialized
backends and hardware, while still providing PyTorch users with an experience
@@ -73,205 +75,21 @@ virtual void destroy(__ET_UNUSED DelegateHandle* handle);
Once the backend is ready, it can then be registered:

To register the backend for AOT lowering, simply import the backend; the import is what makes the backend implementation visible to the lowering machinery:

```python
from executorch.exir.backend.test.backend_with_compiler_demo import BackendWithCompilerDemo
```
To register the backend for runtime, register via the `register_backend` API:

```cpp
__ET_NODISCARD Error register_backend(const Backend& backend);
```

## Frontend interfaces

There are three flows for delegating a program to a backend:

1. Lower the whole module to a backend. This is good for testing backends and
   the preprocessing stage.
1. Lower the whole module to a backend and compose it with another module. This
   is good for reusing lowered modules exported from other flows.
1. Lower parts of a module according to a partitioner. This is good for
   lowering models that include both lowerable and non-lowerable nodes, and is
   the most streamlined process.

### Flow 1: Lowering the whole module

This flow starts from a traced graph module with Edge Dialect representation. To
lower it, we call the following function, which returns a `LoweredBackendModule`
(more documentation on this function can be found in the Python API reference):

```python
# defined in backend_api.py
def to_backend(
    backend_id: str,
    edge_program: ExportedProgram,
    compile_spec: List[CompileSpec],
) -> LoweredBackendModule:
```

Within this function, the backend's `preprocess()` function is called, which
produces a compiled blob to be emitted into the flatbuffer binary. The
lowered module can be captured directly, or put back into a parent module to be
captured. Eventually, the captured module is serialized into the flatbuffer model
that can be loaded by the runtime.
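For backend developers, `preprocess()` is the main ahead-of-time hook to implement. Below is a minimal sketch of what it could look like: the `MyBackend` class and its toy blob format are hypothetical, and the exact base class and result type (`BackendDetails`, `PreprocessResult`) and import paths should be checked against your ExecuTorch version:

```python
# Hypothetical backend sketch: "compiles" an Edge Dialect program into a toy
# comma-separated instruction blob. Import paths and signatures are
# assumptions; verify them against your ExecuTorch version.
from typing import List

from executorch.exir.backend.backend_details import BackendDetails, PreprocessResult
from executorch.exir.backend.compile_spec_schema import CompileSpec
from torch.export import ExportedProgram


class MyBackend(BackendDetails):
    @staticmethod
    def preprocess(
        edge_program: ExportedProgram,
        compile_specs: List[CompileSpec],
    ) -> PreprocessResult:
        # Walk the Edge Dialect graph and emit one "instruction" per operator.
        instructions = []
        for node in edge_program.graph.nodes:
            if node.op == "call_function":
                instructions.append(str(node.target))
        # These bytes are embedded in the .pte file and handed to the runtime
        # backend's init() at model load time.
        return PreprocessResult(processed_bytes=",".join(instructions).encode())
```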
The following is an example of this flow:

```python
import torch

import executorch.exir as exir
from executorch.exir.backend.backend_api import to_backend

# The submodule to run on a specific backend; in this example, the
# `BackendWithCompilerDemo` backend
class LowerableSubModel(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.sin(x)

# Convert the lowerable module to Edge IR representation
to_be_lowered = LowerableSubModel()
example_input = (torch.ones(1), )
to_be_lowered_exir_submodule = exir.capture(to_be_lowered, example_input).to_edge()

# Import the backend implementation
from executorch.exir.backend.test.backend_with_compiler_demo import (
    BackendWithCompilerDemo,
)
lowered_module = to_backend('BackendWithCompilerDemo', to_be_lowered_exir_submodule, [])
```

We can serialize the program to a flatbuffer format by directly running:

```python
# Save the flatbuffer to a local file
save_path = "delegate.pte"
with open(save_path, "wb") as f:
    f.write(lowered_module.buffer())
```

### Flow 2: Lowering the whole module and composite

Alternatively, after flow 1, we can compose this lowered module with another
module:

```python
# This submodule runs in the executor runtime
class NonLowerableSubModel(torch.nn.Module):
    def __init__(self, bias):
        super().__init__()
        self.bias = bias

    def forward(self, a, b):
        return torch.add(torch.add(a, b), self.bias)


# The composite module, including the lowerable and non-lowerable parts
class CompositeModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.non_lowerable = NonLowerableSubModel(torch.ones(1) * 0.3)
        self.lowerable = lowered_module

    def forward(self, x):
        a = self.lowerable(x)
        b = self.lowerable(a)
        ret = self.non_lowerable(a, b)
        return a, b, ret

composite_model = CompositeModel()
model_inputs = (torch.ones(1), )
exec_prog = exir.capture(composite_model, model_inputs).to_edge().to_executorch()

# Save the flatbuffer to a local file
save_path = "delegate.pte"
with open(save_path, "wb") as f:
    f.write(exec_prog.buffer)
```

### Flow 3: Partitioning

The third flow also starts from a traced graph module with Edge Dialect
representation. To lower certain nodes in this graph module, we can use the
overloaded [`to_backend`
function](https://github.com/pytorch/executorch/blob/d9eef24bb720804aa7b400b05241487510ae0dc2/exir/backend/backend_api.py#L39).

```python
def to_backend(
    edge_program: ExportedProgram,
    partitioner: Type[TPartitioner],
) -> ExportedProgram:
```

This function takes in a `Partitioner`, which adds a tag to all the nodes that
are meant to be lowered. The partitioner also returns `partition_tags`, a
mapping from tags to backend names and module compile specs. The tagged nodes
will then be partitioned and lowered to their mapped backends using Flow 1's
process. Available helper partitioners are documented
[here](./compiler-custom-compiler-passes.md). These lowered modules
will be inserted into the top-level module and serialized.
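To make the tagging contract concrete, here is a hypothetical partitioner sketch. The `AddOnlyPartitioner` class is invented for illustration, and the exact base class and result types (`Partitioner`, `PartitionResult`, `DelegationSpec`) and their fields vary across ExecuTorch versions, so treat this as pseudocode for the idea:

```python
# Hypothetical partitioner: tags only add ops for delegation. Import paths,
# base-class signatures, and PartitionResult fields are assumptions that
# should be verified against your ExecuTorch version.
import torch

from executorch.exir.backend.partitioner import (
    DelegationSpec,
    Partitioner,
    PartitionResult,
)


class AddOnlyPartitioner(Partitioner):
    def __init__(self):
        # Each tag maps to a backend id plus its compile specs.
        self.delegation_spec = DelegationSpec("BackendWithCompilerDemo", [])

    def partition(self, graph_module: torch.fx.GraphModule) -> PartitionResult:
        partition_tags = {}
        for node in graph_module.graph.nodes:
            # Tag only the nodes this backend claims; here, just add ops.
            if node.op == "call_function" and "add" in str(node.target):
                tag = "tag0"
                node.meta["delegation_tag"] = tag
                partition_tags[tag] = self.delegation_spec
        return PartitionResult(
            tagged_graph=graph_module, partition_tags=partition_tags
        )
```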
The following is an example of the flow:

```python
import torch

import executorch.exir as exir
from executorch.exir.backend.backend_api import to_backend
from executorch.exir.passes import SpecPropPass

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x, y):
        x = x + y
        x = x * y
        x = x - y
        x = x / y
        x = x * y
        x = x + y
        return x

model = Model()
model_inputs = (torch.randn(1, 3), torch.randn(1, 3))
gm = exir.capture(model, model_inputs).to_edge()

from executorch.exir.backend.test.op_partitioner_demo import AddMulPartitionerDemo
exec_prog = to_backend(gm, AddMulPartitionerDemo).to_executorch(
    exir.ExecutorchBackendConfig(passes=[SpecPropPass()])
)

# Save the flatbuffer to a local file
save_path = "delegate.pte"
with open(save_path, "wb") as f:
    f.write(exec_prog.buffer)
```

## Runtime

The serialized flatbuffer model is loaded by the ExecuTorch runtime. The
preprocessed blob is stored directly in the flatbuffer and is passed to the
backend's `init()` function during the model initialization stage. At
the model execution stage, the initialized handle can be executed through the
backend's `execute()` function.
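As an illustration of this handshake, here is a self-contained C++ sketch of the `init()`/`execute()`/`destroy()` contract. The `MyBackend` class and its signatures are simplified assumptions; the real interface (error codes, contexts, `EValue` arguments) lives in the ExecuTorch runtime headers:

```cpp
// Illustrative, self-contained sketch of the init()/execute()/destroy()
// contract. The real ExecuTorch interface differs; see the runtime backend
// headers for exact signatures.
#include <cstdint>
#include <cstring>
#include <vector>

// Toy handle: whatever the backend needs to keep between init and execute.
struct DelegateHandle {
  std::vector<uint8_t> compiled_blob;
};

class MyBackend {
 public:
  // Model load time: receive the blob produced by preprocess() ahead of
  // time, parse/stage it, and return a handle for later execution.
  DelegateHandle* init(const void* processed, size_t size) {
    auto* handle = new DelegateHandle();
    handle->compiled_blob.resize(size);
    std::memcpy(handle->compiled_blob.data(), processed, size);
    return handle;
  }

  // Model execution time: run the staged artifact against the inputs.
  void execute(DelegateHandle* handle, const float* in, float* out, size_t n) {
    // A real backend would dispatch to its hardware here; this toy copy
    // stands in for "run the compiled blob".
    for (size_t i = 0; i < n; ++i) out[i] = in[i];
  }

  // Model teardown: release everything init() allocated.
  void destroy(DelegateHandle* handle) { delete handle; }
};
```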
To run the real model with the executor:

> :warning: **pybind is not ready for partner preview**: please use the size_test_all_ops or executor_runner cpp binary for now. pybind to run the executor will be ready before MVP.

There is more than one way to invoke `register_backend`. For example, the backend can be registered statically:

```cpp
namespace {
auto cls = BackendWithCompiler();
Backend backend{"BackendWithCompilerDemo", &cls};
static auto success_with_compiler = register_backend(backend);
} // namespace
```
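Static registration like this runs when the containing library is loaded, so linking the backend into the final binary is enough to make it available to the runtime. The implied alternative is to register explicitly during application startup; a sketch, assuming the `register_backend` API shown above and a `MyBackend` class like the one sketched earlier:

```cpp
// Sketch: explicit registration at startup instead of a static initializer.
// MyBackend is the hypothetical backend class from the earlier sketch.
int main() {
  static MyBackend cls;
  Backend backend{"MyBackendName", &cls};
  Error err = register_backend(backend);
  // Check err before loading any .pte program that targets this backend.
  // ... load and run the program ...
  return 0;
}
```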

```python
# Load the program with the executor runtime.
# Assumption: the pybind entry point below is illustrative; the exact module
# path for _load_for_executorch_from_buffer depends on your build.
from executorch.extension.pybindings.portable_lib import (
    _load_for_executorch_from_buffer,
)

executorch_module = _load_for_executorch_from_buffer(flatbuffer)
print("model_inputs: ", model_inputs)
# Execute the program
model_outputs = executorch_module.forward([*model_inputs])
```

## Error Messages

If there is an error in the backend, for example, if there is any operator that
