Commit ff35913

Merge remote-tracking branch 'origin/upstream/main' into HEAD
Change-Id: I6ec014be7a76b17d64c746ee1c1e981923f4ff3f
2 parents bf23874 + 08770b7 commit ff35913

166 files changed, +2617 -2368 lines changed

.ci/docker/ci_commit_pins/buck2.txt

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2024-05-15
+2024-12-16
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+#!/bin/bash
+# Copyright 2024 Arm Limited and/or its affiliates.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+
+# NB: This function could be used to install Arm dependencies
+# Setup arm example environment (including TOSA tools)
+git config --global user.email "[email protected]"
+git config --global user.name "Github Executorch"
+bash examples/arm/setup.sh --i-agree-to-the-contained-eula

.ci/scripts/utils.sh

Lines changed: 0 additions & 11 deletions
@@ -59,17 +59,6 @@ install_flatc_from_source() {
   popd || return
 }
 
-install_arm() {
-  # NB: This function could be used to install Arm dependencies
-  # Setup arm example environment (including TOSA tools)
-  git config --global user.email "[email protected]"
-  git config --global user.name "Github Executorch"
-  bash examples/arm/setup.sh --i-agree-to-the-contained-eula
-
-  # Test tosa_reference flow
-  source examples/arm/ethos-u-scratch/setup_path.sh
-}
-
 build_executorch_runner_buck2() {
   # Build executorch runtime with retry as this step is flaky on macos CI
   retry buck2 build //examples/portable/executor_runner:executor_runner

.github/workflows/pull.yml

Lines changed: 3 additions & 5 deletions
@@ -354,13 +354,11 @@ jobs:
         EXECUTORCH_BUILD_ARM_BAREMETAL=ON \
         .ci/scripts/setup-linux.sh "${BUILD_TOOL}"
 
-        source .ci/scripts/utils.sh
         # Install Arm dependencies
-        install_arm
-
-        # Run pytest with coverage
-        pytest -c /dev/null -v -n auto --cov=./ --cov-report=xml backends/arm/test
+        .ci/scripts/setup-arm-baremetal-tools.sh
 
+        # Run pytest without simulator
+        backends/arm/test/test_arm_baremetal.sh test_pytest
 
   test-llama-runner-qnn-linux:
     name: test-llama-runner-qnn-linux

.github/workflows/trunk.yml

Lines changed: 6 additions & 5 deletions
@@ -146,14 +146,15 @@ jobs:
         source .ci/scripts/utils.sh
         install_executorch
 
-        install_arm
+        .ci/scripts/setup-arm-baremetal-tools.sh
 
         # Increase number of files user can monitor to bypass buck failures.
         # Hopefully this is high enough for this setup.
         sudo sysctl fs.inotify.max_user_watches=1048576 # 1024 * 1024
 
         # Test ethos-u delegate examples with run.sh
-        PYTHON_EXECUTABLE=python bash examples/arm/run.sh examples/arm/ethos-u-scratch/
+        backends/arm/test/test_arm_baremetal.sh test_run_ethosu_fvp
+
 
   test-arm-reference-delegation:
     name: test-arm-reference-delegation
@@ -172,10 +173,10 @@ jobs:
         source .ci/scripts/utils.sh
         install_executorch
 
-        install_arm
+        .ci/scripts/setup-arm-baremetal-tools.sh
 
-        # Run arm unit tests
-        pytest -c /dev/null -v -n auto --cov=./ --cov-report=xml backends/arm/test
+        # Run arm unit tests using the simulator
+        backends/arm/test/test_arm_baremetal.sh test_pytest_ethosu_fvp
 
   test-coreml-delegate:
     name: test-coreml-delegate

backends/arm/README.md

Lines changed: 51 additions & 0 deletions
@@ -39,6 +39,28 @@ Other:
 - `third-party/` - Dependencies on other code - in particular the TOSA serialization_lib for compiling to TOSA and the ethos-u-core-driver for the bare-metal backend supporting Ethos-U
 - `test/` - Unit test and test support functions
 
+## Testing
+
+After a setup you can run unit tests with the test_arm_baremetal.sh script.
+
+To run the pytest suite run
+
+```
+backends/arm/test/test_arm_baremetal.sh test_pytest
+```
+
+To run the unit test suite with Corstone3x0 FVP simulator support use
+
+```
+backends/arm/test/test_arm_baremetal.sh test_pytest_ethosu_fvp
+```
+
+You can test to run some models with the run.sh flow
+
+```
+backends/arm/test/test_arm_baremetal.sh test_run_ethosu_fvp
+```
+
 ## Unit tests
 This is the structure of the test directory
 
@@ -51,6 +73,8 @@ test # Root test folder
 ├── tester # Arm Tester class
 ├── tosautil # Utility functions for TOSA artifacts
 ├ common.py # Common functions and definitions used by many tests
+├ setup_testing.sh # Script to prepare testing for using the Corstone 3x0 FVP
+├ test_arm_baremetal.sh # Help script to trigger testing
 ```
 
 Some example commands to run these tests follow. Run a single test:
@@ -59,6 +83,12 @@ Some example commands to run these tests follow. Run a single test:
 python -m unittest backends.arm.test.ops.test_add.TestSimpleAdd -k test_add2_tosa_BI
 ```
 
+or with pytest
+
+```
+pytest -c /dev/null -v -n auto backends/arm/test/ops/test_add.py -k test_add2_tosa_BI
+```
+
 Or all tests in "TestSimpleAdd":
 
 ```
@@ -71,6 +101,27 @@ Or discover and run many tests:
 python -m unittest discover -s backends/arm/test/ops/
 ```
 
+or with pytest
+
+```
+pytest -c /dev/null -v -n auto backends/arm/test/ops/
+```
+
+
+You can run tests using Corstone3x0 simulators to see how it would work on something more target like
+first you need to build and prepare some used target libs
+
+```
+examples/arm/run.sh --model_name=add --build_only
+backends/arm/test/setup_testing.sh
+```
+
+Then you can run the tests with
+
+```
+pytest -c /dev/null -v -n auto backends/arm/test --arm_quantize_io --arm_run_corstoneFVP
+```
+
 ### Code coverage
 
 To get code coverage:
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
+# Copyright 2024 Arm Limited and/or its affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+
+import itertools
+
+import torch
+from executorch.backends.arm._passes.arm_pass_utils import create_node
+from executorch.backends.arm.tosa_quant_utils import dq_op, q_op
+from executorch.exir.dialects._ops import ops as exir_ops
+from executorch.exir.pass_base import ExportPass, PassResult
+from torch.fx import GraphModule
+from torch.fx.passes.utils.source_matcher_utils import get_source_partitions
+
+
+class AnnotateDecomposedMatmulPass(ExportPass):
+    """
+    torch.matmul can be decomposed in many ways, for instance:
+    dq -> matmul -> q can become
+    dq -> repeat -> view -> bmm -> view -> q which makes quantization folding
+    difficult. This helper function finds all matmul partitions and annotates
+    each partition's matmul op (can be mm or bmm).
+    """
+
+    def call(self, graph_module: GraphModule) -> PassResult:
+        matmul_partitions = get_source_partitions(
+            graph_module.graph,
+            [
+                torch.matmul,
+            ],
+            None,
+        )
+        matmul_partitions = list(
+            itertools.chain.from_iterable(matmul_partitions.values())
+        )
+        matmul_targets = {
+            exir_ops.edge.aten.mm.default,
+            exir_ops.edge.aten.bmm.default,
+        }
+        for partition in matmul_partitions:
+            quantized_input = all(
+                input_node.target == dq_op for input_node in partition.input_nodes
+            )
+            matmul_node = [
+                node for node in partition.nodes if node.target in matmul_targets
+            ][0]
+            if quantized_input:
+                matmul_args = matmul_node.all_input_nodes
+                for i in range(len(matmul_args)):
+                    input_node = partition.input_nodes[i]
+                    matmul_input_node = matmul_args[i]
+                    # Remove partition input dq-node
+                    input_node.replace_all_uses_with(input_node.all_input_nodes[0])
+                    graph_module.graph.erase_node(input_node)
+                    input_node_qargs = input_node.args[1:]
+                    with graph_module.graph.inserting_before(matmul_node):
+                        # Create new dq-node before matmul
+                        dq_node = create_node(
+                            graph=graph_module.graph,
+                            op_target=dq_op,
+                        )
+                        dq_node.args = (matmul_input_node, *input_node_qargs)
+                        matmul_node.replace_input_with(matmul_input_node, dq_node)
+
+            partition_output = list(partition.output_nodes[0].users)[0]
+            quantized_output = partition_output.target == q_op
+            if quantized_output:
+                output_node_qargs = partition_output.args[1:]
+                with graph_module.graph.inserting_after(matmul_node):
+                    # Create q-node after matmul
+                    q_node = create_node(
+                        graph=graph_module.graph,
+                        op_target=q_op,
+                    )
+                    matmul_node.replace_all_uses_with(q_node)
+                    q_node.args = (matmul_node, *output_node_qargs)
+                    # Remove partition output q-node
+                    partition_output.replace_all_uses_with(
+                        partition_output.all_input_nodes[0]
+                    )
+                    graph_module.graph.erase_node(partition_output)
+
+        # retrace the graph to update the fake tensor types
+        graph_module = super().call(graph_module).graph_module
+
+        graph_module.recompile()
+        return PassResult(graph_module, True)
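
Reading note on the pass above: the rewiring it performs can be pictured on a toy linear chain of op names. The sketch below is not the pass itself and does not use torch.fx. The function name `pull_qdq_next_to_matmul` and the list-of-strings representation are assumptions made for illustration only. It shows the intended result: the `dq` at the partition input ends up directly before the `mm`/`bmm` op, and the `q` at the partition output ends up directly after it.

```python
def pull_qdq_next_to_matmul(chain, matmul_ops=("mm", "bmm")):
    """Toy model of the rewiring on a linear chain of op names.

    Hypothetical helper: real FX graphs are DAGs with one dq node per
    matmul input, while this sketch handles a single chain only.
    """
    ops = list(chain)
    # Locate the first mm/bmm op in the chain.
    idx = next(i for i, op in enumerate(ops) if op in matmul_ops)
    before, matmul, after = ops[:idx], ops[idx], ops[idx + 1:]
    # Move a leading dq so that it directly precedes the matmul op.
    if before and before[0] == "dq":
        before = before[1:] + ["dq"]
    # Move a trailing q so that it directly follows the matmul op.
    if after and after[-1] == "q":
        after = ["q"] + after[:-1]
    return before + [matmul] + after


decomposed = ["dq", "repeat", "view", "bmm", "view", "q"]
annotated = pull_qdq_next_to_matmul(decomposed)
print(annotated)
```

With `dq` and `q` adjacent to `bmm`, a later folding step only has to look at the immediate neighbours of the matmul op, which appears to be the motivation stated in the docstring.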

backends/arm/_passes/arm_pass_manager.py

Lines changed: 46 additions & 12 deletions
@@ -11,6 +11,9 @@
 from executorch.backends.arm._passes.annotate_channels_last_dim_order_pass import (
     AnnotateChannelsLastDimOrder,
 )
+from executorch.backends.arm._passes.annotate_decomposed_matmul import (
+    AnnotateDecomposedMatmulPass,
+)
 from executorch.backends.arm._passes.cast_int64_pass import CastInt64ToInt32Pass
 from executorch.backends.arm._passes.conv1d_unsqueeze_pass import Conv1dUnsqueezePass
 from executorch.backends.arm._passes.convert_expand_copy_to_repeat import (
@@ -32,7 +35,9 @@
 from executorch.backends.arm._passes.fold_qdq_with_annotated_qparams_pass import (
     FoldAndAnnotateQParamsPass,
     QuantizeFullArgument,
+    RetraceFoldedDtypesPass,
 )
+from executorch.backends.arm._passes.insert_table_ops import InsertTableOpsPass
 from executorch.backends.arm._passes.keep_dims_false_to_squeeze_pass import (
     KeepDimsFalseToSqueezePass,
 )
@@ -67,24 +72,15 @@ def transform_to_backend_pipeline(
         self, exported_program: ExportedProgram, compile_spec: list[CompileSpec]
     ):
         """Apply passes before transforming program to backend"""
-        self.add_pass(CastInt64ToInt32Pass(exported_program))
+        self.add_pass(DecomposeLinearPass())
         self.add_pass(RemoveGetItemPass())
-        self.add_pass(UnsqueezeScalarPlaceholdersPass(exported_program))
-        self.add_pass(SizeAdjustConv2DPass())
-        self.add_pass(RemoveClonePass())
-        self.add_pass(ConvertExpandCopyToRepeatPass())
         self.add_pass(DecomposeLayerNormPass())
-        self.add_pass(UnsqueezeBeforeRepeatPass())
         self.add_pass(DecomposeVarPass())
         self.add_pass(ConvertMeanDimToAveragePool())
         self.add_pass(DecomposeMeanDimPass())
-        self.add_pass(MatchArgRanksPass(exported_program))
-        self.add_pass(DecomposeDivPass())
-        self.add_pass(KeepDimsFalseToSqueezePass())
         self.add_pass(ConvertSplitToSlicePass())
-        self.add_pass(Conv1dUnsqueezePass(exported_program))
-        self.add_pass(DecomposeSoftmaxesPass())
-        self.add_pass(DecomposeLinearPass())
+        # TODO MLETORCH-558
+        self.add_pass(AnnotateDecomposedMatmulPass())
         self.add_pass(QuantizeFullArgument())
         self.add_pass(
             FoldAndAnnotateQParamsPass(
@@ -93,11 +89,49 @@ def transform_to_backend_pipeline(
                     exir_ops.edge.aten.maximum.default,
                     exir_ops.edge.aten.add.Tensor,
                     exir_ops.edge.aten.avg_pool2d.default,
+                    exir_ops.edge.aten.bmm.default,
+                    exir_ops.edge.aten.cat.default,
                     exir_ops.edge.aten.convolution.default,
+                    exir_ops.edge.aten.clone.default,
+                    exir_ops.edge.aten.exp.default,
+                    exir_ops.edge.aten.expand_copy.default,
                     exir_ops.edge.aten.full.default,
+                    exir_ops.edge.aten.hardtanh.default,
+                    exir_ops.edge.aten.log.default,
+                    exir_ops.edge.aten.max_pool2d.default,
+                    exir_ops.edge.aten.mm.default,
+                    exir_ops.edge.aten.mul.Tensor,
+                    exir_ops.edge.aten.permute_copy.default,
+                    exir_ops.edge.aten.reciprocal.default,
+                    exir_ops.edge.aten.relu.default,
+                    exir_ops.edge.aten.repeat.default,
+                    exir_ops.edge.aten.rsqrt.default,
+                    exir_ops.edge.aten.select_copy.int,
+                    exir_ops.edge.aten.sigmoid.default,
+                    exir_ops.edge.aten.slice_copy.Tensor,
+                    exir_ops.edge.aten.squeeze_copy.dims,
+                    exir_ops.edge.aten.sub.Tensor,
+                    exir_ops.edge.aten.sum.dim_IntList,
+                    exir_ops.edge.aten.tanh.default,
+                    exir_ops.edge.aten.unsqueeze_copy.default,
+                    exir_ops.edge.aten.upsample_nearest2d.vec,
+                    exir_ops.edge.aten.view_copy.default,
                 ]
             )
         )
+        self.add_pass(RetraceFoldedDtypesPass())
+        self.add_pass(InsertTableOpsPass(exported_program))
+        self.add_pass(ConvertExpandCopyToRepeatPass())
+        self.add_pass(UnsqueezeBeforeRepeatPass())
+        self.add_pass(CastInt64ToInt32Pass(exported_program))
+        self.add_pass(UnsqueezeScalarPlaceholdersPass(exported_program))
+        self.add_pass(SizeAdjustConv2DPass())
+        self.add_pass(RemoveClonePass())
+        self.add_pass(MatchArgRanksPass(exported_program))
+        self.add_pass(DecomposeDivPass())
+        self.add_pass(KeepDimsFalseToSqueezePass())
+        self.add_pass(Conv1dUnsqueezePass(exported_program))
+        self.add_pass(DecomposeSoftmaxesPass())
         for spec in compile_spec:
             if spec.key == "permute_memory_format":
                 memory_format = spec.value.decode()
