Commit f8c345b

Arm backend: Refactor ArmBackend (#6495)
Refactor ArmBackend/Partition/Quantizer

Split up ArmBackend into separate TOSA and Ethos-U classes.
Move partitioner and backend to separate files and remove Arm prefix.
Add compile_spec argument to EthosUQuantizer

Signed-off-by: Per Åstrand <[email protected]>
1 parent 29e7afa commit f8c345b

28 files changed, +538 -322 lines changed

backends/arm/CMakeLists.txt

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-# Copyright 2023 Arm Limited and/or its affiliates.
+# Copyright 2023, 2025 Arm Limited and/or its affiliates.
 #
 # This source code is licensed under the BSD-style license found in the
 # LICENSE file in the root directory of this source tree.
@@ -22,7 +22,7 @@ set(THIRD_PARTY_ROOT "${CMAKE_CURRENT_SOURCE_DIR}/third-party")
 set(DRIVER_ETHOSU_INCLUDE_DIR "${THIRD_PARTY_ROOT}/ethos-u-core-driver/include")
 include_directories(${DRIVER_ETHOSU_INCLUDE_DIR})

-set(_arm_baremetal_sources backends/arm/runtime/ArmBackendEthosU.cpp
+set(_arm_baremetal_sources backends/arm/runtime/EthosUBackend.cpp
 backends/arm/runtime/VelaBinStream.cpp
 )
 list(TRANSFORM _arm_baremetal_sources PREPEND "${EXECUTORCH_ROOT}/")

backends/arm/README.md

Lines changed: 12 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ ethos-u-vela compilation stack. which follows the fully AoT flow.
1515
## Layout
1616

1717
Export:
18-
- `arm_backend.py` - Main entrypoint for the ArmPartitioner and ArmBackend. For more information see the section on
18+
- `ethosu_backend.py` - Main entrypoint for the EthosUBackend. For more information see the section on
1919
[Arm Backend Architecture](#arm-backend-architecture). For examples of use see `executorch/examples/arm`.
2020
- `tosa_mapping.py` - utilities for mapping edge dialect to TOSA
2121
- `tosa_quant_utils.py` - utilities for mapping quantization information to TOSA encoding
@@ -29,11 +29,11 @@ Passes:
2929
- `*_pass.py` - Compiler passes derived from ExportPass
3030

3131
Quantization:
32-
- `arm_quantizer.py` - Quantizer for Arm backend
32+
- `arm_quantizer.py` - Quantizers for Arm backend. Contains the EthosUQuantizer which inherits from the TOSAQuantizer
3333
- `arm_quantizer_utils.py` - Utilities for quantization
3434

3535
Runtime:
36-
- `runtime/ArmBackendEthosU.cpp` - The Arm backend implementation of the ExecuTorch runtime backend (BackendInterface) for Ethos-U
36+
- `runtime/ArmEthosUBackend.cpp` - The Arm backend implementation of the ExecuTorch runtime backend (BackendInterface) for Ethos-U
3737

3838
Other:
3939
- `third-party/` - Dependencies on other code - in particular the TOSA serialization_lib for compiling to TOSA and the ethos-u-core-driver for the bare-metal backend supporting Ethos-U
@@ -177,6 +177,7 @@ create an issue on [github](https://www.github.com/pytorch/executorch/issues).
177177
# Arm Backend Architecture
178178

179179
The broad principle with the Arm backend implemention for ExecuTorch is to support multiple Arm devices and device configurations through a largely Homogeneous flow with maximal sharing of class logic.
180+
The EthosUBackend is currently the one user facing API that target the Ethos-U55 and Ethos-U85 hardware IP. It is using the TOSABackend under the hood to share code and functionality, but also to separate testing possibilities to the TOSA flow itself.
180181

181182
In practice for compilation, this means that the flow goes via [Arm TOSA](https://www.mlplatform.org/tosa/tosa_spec.html) to produce a common IR and quantization behaviour compatible with our various IP, and typically, device-specific backends to further lower to a device specific binary which can happen ahead of time (within the Python development flow) or at runtime (during a JIT compilation stage).
182183

@@ -185,22 +186,22 @@ In practice for the runtime, this means we will share common runtime backend fun
185186

186187
## Arm Backend Status and Maturity
187188

188-
The Arm Backend should be considered a prototype quality at this point, likely subject to significant change and improvement, and with a limited coverage of functionality. We are actively developing this codebase.
189+
The Arm EthosU Backend should be considered a prototype quality at this point, likely subject to significant change and improvement, and with a limited coverage of functionality. We are actively developing this codebase.
189190

190191
## Current flows
191192

192-
The ArmBackend has a two stage process,
193-
- Compile to TOSA to rationalise the graph into known hardware support profiles. Currently this is to v0.80 TOSA BI with specific concern to a subset which gives support on Ethos-U55, the target of the initial prototype efforts.
193+
The EthosUBackend has a two stage process,
194+
- Compile to TOSA to rationalise the graph into known hardware support profiles. Currently this is to v0.80 TOSA BI with specific concern to a subset which gives support on Ethos-U55 and Ethos-U85, the target of the initial prototype efforts. This calls into the TOSABackend.
194195
- Lower via the ethos-u-vela compilation flow which takes TOSA v0.80 as an input and produces a low level commandstream for the hardware which is then passed via the delegate to the ethos-u-core-driver for direct execution.
195196

196-
The ArmPartitioner is currenly used to ensure the operations converted are Ethos-U compatible, but will be extended to offer spec-correct TOSA Base inference and TOSA Main Inference generation in future.
197+
The EthosUPartitioner is currenly used to ensure the operations converted are Ethos-U compatible, but will be extended to offer spec-correct TOSA Base inference and TOSA Main Inference generation in future.
198+
199+
There is also a generic TOSABackend with accompanying TOSAPartitioner and TOSAQuantizer, which are used by the EthosUBackend and friends. The Arm TOSA Backend can be used by it's own to verify the lowering to the TOSA representation of the model (refer to the unit tests in backends/arm/test which uses the TOSA backend in the test suites).
197200

198201
### Controlling compilation
199202

200203
It is possible to control the compilation flow to aid in development and debug of both networks and the code itself.
201204

202-
Configuration of the ArmBackend export flow is controlled by CompileSpec information (essentially used as compilation flags) to determine which of these outputs is produced. In particular this allows for use of the tosa_reference_model to run intermediate output to check for correctness and quantization accuracy without a full loop via hardware implemntation.
203-
204-
As this is in active development see the ArmBackend for accurate information on [compilation flags](https://github.com/pytorch/executorch/blob/29f6dc9353e90951ed3fae3c57ae416de0520067/backends/arm/arm_backend.py#L319-L324)
205+
Configuration of the EthosUBackend export flow is controlled by CompileSpec information (essentially used as compilation flags) to determine which of these outputs is produced. In particular this allows for use of the tosa_reference_model to run intermediate output to check for correctness and quantization accuracy without a full loop via hardware implemntation.
205206

206-
You can also refer to the [example TOSA end-to-end code](/examples/arm/arm_tosa_e2e.py)
207+
As this is in active development see the EthosUBackend for accurate information on [compilation flags](https://github.com/pytorch/executorch/blob/29f6dc9353e90951ed3fae3c57ae416de0520067/backends/arm/arm_backend.py#L319-L324)

backends/arm/arm_backend.py

Lines changed: 20 additions & 132 deletions
Original file line numberDiff line numberDiff line change
@@ -12,35 +12,16 @@
1212
#
1313

1414
import logging
15-
import os
16-
from typing import cast, final, List, Optional
1715

18-
import serializer.tosa_serializer as ts # type: ignore
19-
from executorch.backends.arm.arm_vela import vela_compile
20-
from executorch.backends.arm.operators.node_visitor import get_node_visitors
16+
from typing import List, Optional
2117

2218
from executorch.backends.arm.tosa_specification import TosaSpecification
23-
from executorch.backends.arm._passes.arm_pass_manager import (
24-
ArmPassManager,
25-
) # usort: skip
26-
from executorch.backends.arm.process_node import (
27-
process_call_function,
28-
process_output,
29-
process_placeholder,
30-
)
31-
from executorch.backends.arm.tosa_utils import dbg_fail, dbg_tosa_dump
32-
from executorch.exir.backend.backend_details import BackendDetails, PreprocessResult
19+
3320
from executorch.exir.backend.compile_spec_schema import CompileSpec
34-
from torch.export.exported_program import ExportedProgram
35-
from torch.fx import Node
3621

37-
# TOSA backend debug functionality
22+
3823
logger = logging.getLogger(__name__)
3924
logger.setLevel(logging.WARNING)
40-
TOSA_DBG_VERBOSE = os.environ.get("TOSA_DBG_VERBOSE") == "1"
41-
if TOSA_DBG_VERBOSE:
42-
logging.basicConfig(level=logging.INFO)
43-
logger.setLevel(logging.INFO)
4425

4526

4627
class ArmCompileSpecBuilder:
@@ -49,7 +30,6 @@ def __init__(self):
4930
self.compiler_flags = []
5031
self.output_format = None
5132
self.path_for_intermediates = None
52-
self.tosa_version = None
5333
self.tosa_spec = None
5434
self.input_order = None
5535

@@ -130,7 +110,7 @@ def build(self) -> List[CompileSpec]:
130110
assert self.tosa_spec
131111

132112
# Always supply a TOSA version
133-
self.compile_spec = [CompileSpec("tosa_version", str(self.tosa_spec).encode())]
113+
self.compile_spec = [CompileSpec("tosa_spec", str(self.tosa_spec).encode())]
134114

135115
if self.output_format == "vela":
136116
self.compile_spec += [
@@ -156,125 +136,33 @@ def build(self) -> List[CompileSpec]:


 def is_tosa(compile_spec: List[CompileSpec]) -> bool:
+    has_tosa_output = False
+    has_tosa_spec = False
+    for spec in compile_spec:
+        if spec.key == "output_format":
+            has_tosa_output = spec.value.decode() == "tosa"
+        if spec.key == "tosa_spec":
+            has_tosa_spec = True
+
+    return has_tosa_output and has_tosa_spec
+
+
+def is_ethosu(compile_spec: List[CompileSpec]) -> bool:
     for spec in compile_spec:
         if spec.key == "output_format":
-            return spec.value.decode() == "tosa"
+            return spec.value.decode() == "vela"
     return False


-def get_tosa_version(compile_spec: List[CompileSpec]) -> TosaSpecification:
+def get_tosa_spec(compile_spec: List[CompileSpec]) -> TosaSpecification:
     for spec in compile_spec:
-        if spec.key == "tosa_version":
+        if spec.key == "tosa_spec":
             return TosaSpecification.create_from_string(spec.value.decode())
-    raise RuntimeError("Could not find TOSA version in CompileSpec")
+    raise ValueError("Could not find TOSA version in CompileSpec")


 def get_intermediate_path(compile_spec: List[CompileSpec]) -> Optional[str]:
     for spec in compile_spec:
         if spec.key == "debug_artifact_path":
             return spec.value.decode()
     return None
-
-
-def _get_first_delegation_tag(graph_module) -> str | None:
-    """Get the first delegation tag from the graph_module or return None."""
-    for node in graph_module.graph.nodes:
-        tag = node.meta.get("delegation_tag")
-        if tag:
-            return tag
-
-    logger.debug("No delegation tag found in partition.")
-    return None
-
-
-@final
-class ArmBackend(BackendDetails):
-    @staticmethod
-    def preprocess(  # noqa: C901
-        edge_program: ExportedProgram,
-        compile_spec: List[CompileSpec],
-    ) -> PreprocessResult:
-        logger.info("ArmBackend::preprocess")
-
-        # if a debug/test build capture output files from TOSA stage
-        artifact_path = None
-        output_format = ""
-        compile_flags = []
-        input_order = []
-        for spec in compile_spec:
-            if spec.key == "debug_artifact_path":
-                artifact_path = spec.value.decode()
-            if spec.key == "output_format":
-                output_format = spec.value.decode()
-            if spec.key == "compile_flags":
-                compile_flags.append(spec.value.decode())
-            if spec.key == "input_order":
-                input_order = list(map(int, spec.value.decode().split(",")))
-
-        # Check that the output format is set in the compile spec
-        if not output_format:
-            raise RuntimeError("output format is required")
-
-        tosa_spec = TosaSpecification.create_from_compilespecs(compile_spec)
-        assert (
-            tosa_spec is not None
-        ), "TOSA backend needs a TOSA version specified in the CompileSpec!"
-
-        if output_format == "vela" and len(compile_flags) == 0:
-            # Not testing for compile_flags correctness here, just that they are
-            # present. The compiler will give errors if they are not valid.
-            raise RuntimeError("compile flags are required for vela output format")
-
-        logger.info(f"Converting ExportedProgram to TOSA: {tosa_spec}")
-
-        # Converted output for this subgraph, serializer needs path early as it emits
-        # const data directly. Path created and data written only in debug builds.
-        tosa_graph = ts.TosaSerializer(artifact_path)
-        graph_module = ArmPassManager(tosa_spec).transform_to_backend_pipeline(  # type: ignore
-            exported_program=edge_program
-        )
-
-        node_visitors = get_node_visitors(edge_program, tosa_spec)
-        input_count = 0
-        for node in graph_module.graph.nodes:
-            node = cast(Node, node)
-            if node.op == "call_function":
-                process_call_function(node, tosa_graph, node_visitors, tosa_spec)
-            elif node.op == "placeholder":
-                process_placeholder(node, tosa_graph, edge_program, tosa_spec)
-                if node.name in edge_program.graph_signature.user_inputs:
-                    input_count += 1
-            elif node.op == "output":
-                process_output(node, tosa_graph)
-            else:
-                # This will only happen if an unpartitioned graph is passed without
-                # any checking of compatibility.
-                dbg_fail(node, tosa_graph, artifact_path)
-
-        if len(input_order) > 0:
-            if input_count != len(input_order):
-                raise RuntimeError(
-                    "The rank of the input order is not equal to amount of input tensors"
-                )
-
-        if artifact_path:
-            tag = _get_first_delegation_tag(graph_module)
-            dbg_tosa_dump(
-                tosa_graph,
-                artifact_path,
-                suffix="{}".format(f"_{tag}" if tag else ""),
-            )
-
-        # Serialize and return the program. While we have always produced TOSA
-        # output as an intermediate, some flows compile to device binaries in
-        # preprocess and some consume TOSA fb directly.
-        if output_format == "vela":
-            # Emit vela_bin_stream format
-            binary = vela_compile(tosa_graph, compile_flags, input_order)
-        elif output_format == "tosa":
-            # Emit TOSA flatbuffer
-            binary = bytes(tosa_graph.serialize())
-        else:
-            raise RuntimeError(f"Unknown format {output_format}")
-
-        return PreprocessResult(processed_bytes=binary)
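
Note on the `is_tosa` / `is_ethosu` helpers in `arm_backend.py`: the new `is_tosa` returns true only when the spec list has both an `output_format` of `tosa` and a `tosa_spec` entry, while `is_ethosu` looks only at `output_format`. The sketch below restates the two helpers so they run without ExecuTorch installed. The `CompileSpec` dataclass is a stand-in for `executorch.exir.backend.compile_spec_schema.CompileSpec` (assumed here to be a `key` string plus a `value` bytes payload, which is how the diff uses it), and the spec string `TOSA-0.80+BI` is an illustrative value, not taken from this commit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompileSpec:
    """Stand-in for the ExecuTorch CompileSpec (assumed: str key, bytes value)."""

    key: str
    value: bytes


def is_tosa(compile_spec: List[CompileSpec]) -> bool:
    # Mirrors the helper added in the diff: both conditions must hold.
    has_tosa_output = False
    has_tosa_spec = False
    for spec in compile_spec:
        if spec.key == "output_format":
            has_tosa_output = spec.value.decode() == "tosa"
        if spec.key == "tosa_spec":
            has_tosa_spec = True
    return has_tosa_output and has_tosa_spec


def is_ethosu(compile_spec: List[CompileSpec]) -> bool:
    # Mirrors the helper added in the diff: only output_format is inspected.
    for spec in compile_spec:
        if spec.key == "output_format":
            return spec.value.decode() == "vela"
    return False


# Illustrative spec lists; the TOSA spec string is an assumed example value.
tosa_only = [
    CompileSpec("tosa_spec", b"TOSA-0.80+BI"),
    CompileSpec("output_format", b"tosa"),
]
vela = [
    CompileSpec("tosa_spec", b"TOSA-0.80+BI"),
    CompileSpec("output_format", b"vela"),
]
missing_spec = [CompileSpec("output_format", b"tosa")]

print(is_tosa(tosa_only), is_ethosu(tosa_only))  # True False
print(is_tosa(vela), is_ethosu(vela))  # False True
print(is_tosa(missing_spec))  # False
```

The last case is the behavioural change worth noticing: a list that asks for `tosa` output but carries no `tosa_spec` entry satisfied the old `is_tosa` and does not satisfy the new one.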

backends/arm/arm_vela.py

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -39,13 +39,12 @@ def vela_bin_pack_io(prefix, data, shape_order=None):
3939
# Output via Vela to binary stream for ArmBackendEthosU
4040
# WARNING: Do not change this without changing VelaBinStream.cpp as that
4141
# function consumes this format and the two need to align.
42-
def vela_compile(tosa_graph, args: List[str], shape_order=None):
42+
def vela_compile(tosa_flatbuffer: bytes, args: List[str], shape_order=None):
4343
with tempfile.TemporaryDirectory() as tmpdir:
4444
tosaname = "out.tosa"
45-
flatbuffer = tosa_graph.serialize()
4645
tosa_path = os.path.join(tmpdir, tosaname)
4746
with open(tosa_path, "wb") as f:
48-
f.write(flatbuffer)
47+
f.write(tosa_flatbuffer)
4948

5049
# invoke vela
5150
output_dir = os.path.join(tmpdir, "output")
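
Note on the `vela_compile` signature change: the function now receives the already-serialized TOSA flatbuffer as `bytes` instead of a serializer object, so the caller serializes once and the Vela step only writes the bytes to a temporary file. The sketch below isolates that file-writing step. `write_tosa_flatbuffer` is a hypothetical helper name introduced only for illustration, and the payload is placeholder bytes rather than a valid TOSA flatbuffer.

```python
import os
import tempfile


def write_tosa_flatbuffer(
    tosa_flatbuffer: bytes, directory: str, name: str = "out.tosa"
) -> str:
    """Hypothetical helper: write already-serialized TOSA bytes to disk."""
    tosa_path = os.path.join(directory, name)
    with open(tosa_path, "wb") as f:
        f.write(tosa_flatbuffer)
    return tosa_path


with tempfile.TemporaryDirectory() as tmpdir:
    payload = b"\x00placeholder-bytes"  # not a real TOSA flatbuffer
    path = write_tosa_flatbuffer(payload, tmpdir)
    file_name = os.path.basename(path)
    with open(path, "rb") as f:
        round_trip = f.read()

print(file_name, round_trip == payload)  # out.tosa True
```

Taking `bytes` removes the dependency on the serializer type from the Vela step, which is what lets `EthosUBackend` pass `processed_bytes` from the TOSA stage straight through.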

backends/arm/ethosu_backend.py

Lines changed: 86 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,86 @@
1+
# Copyright 2025 Arm Limited and/or its affiliates.
2+
#
3+
# This source code is licensed under the BSD-style license found in the
4+
# LICENSE file in the root directory of this source tree.
5+
6+
# pyre-unsafe
7+
8+
#
9+
# Main implementation of AoT flow to partition and preprocess for Arm target
10+
# backends. Converts via TOSA as an intermediate form supported by AoT and
11+
# JIT compiler flows.
12+
#
13+
14+
import logging
15+
from typing import final, List
16+
17+
from executorch.backends.arm.arm_vela import vela_compile
18+
19+
from executorch.backends.arm.tosa_backend import TOSABackend
20+
from executorch.exir.backend.backend_details import BackendDetails, PreprocessResult
21+
from executorch.exir.backend.compile_spec_schema import CompileSpec
22+
from torch.export.exported_program import ExportedProgram
23+
24+
# debug functionality
25+
logger = logging.getLogger(__name__)
26+
logger.setLevel(logging.WARNING)
27+
28+
29+
@final
30+
class EthosUBackend(BackendDetails):
31+
"""
32+
BackendDetails subclass for delegation to Ethos-U. Deduce the TOSA lowering from
33+
the compile spec list by filtering out the compile spec values that are of interest
34+
for the TOSABackend.
35+
"""
36+
37+
@staticmethod
38+
def _compile_tosa_flatbuffer(
39+
tosa_flatbuffer: bytes, compile_spec: list[CompileSpec]
40+
) -> bytes:
41+
"""
42+
Static helper method to do the compilation of the TOSA flatbuffer
43+
representation to a target specific binary stream.
44+
"""
45+
compile_flags = []
46+
input_order = []
47+
for spec in compile_spec:
48+
if spec.key == "compile_flags":
49+
compile_flags.append(spec.value.decode())
50+
if spec.key == "input_order":
51+
input_order = list(map(int, spec.value.decode().split(",")))
52+
53+
if len(compile_flags) == 0:
54+
# Not testing for compile_flags correctness here, just that they are
55+
# present. The compiler will give errors if they are not valid.
56+
raise RuntimeError(
57+
"compile_flags are required in the CompileSpec list for EthosUBackend"
58+
)
59+
60+
# Pass on the TOSA flatbuffer to the vela compiler.
61+
binary = vela_compile(tosa_flatbuffer, compile_flags, input_order)
62+
return binary
63+
64+
@staticmethod
65+
def preprocess(
66+
edge_program: ExportedProgram,
67+
compile_spec: List[CompileSpec],
68+
) -> PreprocessResult:
69+
logger.info(f"{EthosUBackend.__name__} preprocess")
70+
71+
# deduce TOSA compile_spec from Ethos-U compile spec. We get a new
72+
# compile spec list, containing only elements relevant for the
73+
# TOSABackend.
74+
tosa_compile_spec = TOSABackend.filter_tosa_compile_specs(compile_spec)
75+
76+
# Backends doesn't allow inheritance, as stated in comments in exir/backend/backend_api.py
77+
# ('All backend implementation are final...'), so use composition instead.
78+
# preprocess returns the serialized TOSA flatbuffer in .processed_bytes,
79+
# which can be passed on to next compilation step.
80+
tosa_preprocess = TOSABackend.preprocess(edge_program, tosa_compile_spec)
81+
82+
binary = EthosUBackend._compile_tosa_flatbuffer(
83+
tosa_preprocess.processed_bytes, compile_spec
84+
)
85+
86+
return PreprocessResult(processed_bytes=binary)
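
Note on the composition in `EthosUBackend.preprocess`: because backends are final, the Ethos-U stage calls the TOSA stage and then post-processes its bytes rather than subclassing it. The sketch below shows that two-stage shape with stand-ins only. `CompileSpec` is again an assumed stand-in dataclass, `FakeTosaStage` and `FakeEthosUStage` are hypothetical classes that replace `TOSABackend` and `EthosUBackend`, and `--flag-a` / `--flag-b` are made-up flags, not real Vela options. No real lowering or Vela call happens here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CompileSpec:
    """Stand-in for the ExecuTorch CompileSpec (assumed: str key, bytes value)."""

    key: str
    value: bytes


def collect_vela_options(
    compile_spec: List[CompileSpec],
) -> Tuple[List[str], List[int]]:
    # Same parsing as _compile_tosa_flatbuffer in the diff: flags accumulate,
    # input_order is a comma-separated list of ints.
    compile_flags: List[str] = []
    input_order: List[int] = []
    for spec in compile_spec:
        if spec.key == "compile_flags":
            compile_flags.append(spec.value.decode())
        if spec.key == "input_order":
            input_order = list(map(int, spec.value.decode().split(",")))
    if len(compile_flags) == 0:
        raise RuntimeError("compile_flags are required in the CompileSpec list")
    return compile_flags, input_order


class FakeTosaStage:
    """Hypothetical stand-in for the TOSA stage; returns placeholder bytes."""

    @staticmethod
    def preprocess(program: str) -> bytes:
        return f"tosa:{program}".encode()


class FakeEthosUStage:
    """Hypothetical stand-in that composes the TOSA stage instead of inheriting."""

    @staticmethod
    def preprocess(program: str, compile_spec: List[CompileSpec]) -> bytes:
        tosa_bytes = FakeTosaStage.preprocess(program)
        flags, _order = collect_vela_options(compile_spec)
        # The real backend would hand tosa_bytes, flags and order to vela_compile.
        return b"vela[" + " ".join(flags).encode() + b"]:" + tosa_bytes


specs = [
    CompileSpec("compile_flags", b"--flag-a"),
    CompileSpec("compile_flags", b"--flag-b"),
    CompileSpec("input_order", b"1,0"),
]
flags, order = collect_vela_options(specs)
binary = FakeEthosUStage.preprocess("model", specs)

print(flags, order)  # ['--flag-a', '--flag-b'] [1, 0]
print(binary)  # b'vela[--flag-a --flag-b]:tosa:model'
```

As in the diff, an empty flag list is rejected before any compilation is attempted, and flag validity is left to the compiler.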
