
Commit a4b05dd

Update base for Update on "[ET-VK] Adding batch processing in x axis to conv2d dw shader by caching input texel for reuse."
This diff adds batch processing in the x axis to the conv2d dw shader by reusing the input texels that overlap between consecutive tiles. The changes modify the GLSL code for the conv2d dw output tile, add a new parameter to the YAML file, and update Convolution.cpp to use the new parameter.

Differential Revision: [D67868671](https://our.internmc.facebook.com/intern/diff/D67868671/)

[ghstack-poisoned]
2 parents 2f14536 + 39e8538 commit a4b05dd
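For context, the win described in the commit message comes from data reuse: neighboring depthwise-conv outputs along x read overlapping input texels, so a batch of consecutive outputs can share one cached input span instead of each refetching its own window. The sketch below models that idea in plain Python under stated assumptions (a 1D slice of the convolution, a hypothetical `batch_x` tile width); it is not the actual GLSL shader touched by this commit.

```python
# Illustrative model of the texel-reuse idea from the commit message;
# plain Python, not the real conv2d dw GLSL shader.

def dw_conv_row(inp, weights, batch_x=4):
    """One row of a depthwise conv: each output needs K consecutive
    inputs, and consecutive outputs overlap in K - 1 of them."""
    K = len(weights)  # kernel width, e.g. 3
    out = []
    num_outputs = len(inp) - K + 1
    for x0 in range(0, num_outputs, batch_x):
        n = min(batch_x, num_outputs - x0)
        # Load the input span for this batch of outputs once:
        # n + K - 1 loads instead of n * K without reuse.
        window = inp[x0 : x0 + n + K - 1]
        for i in range(n):
            out.append(sum(window[i + k] * weights[k] for k in range(K)))
    return out

print(dw_conv_row([1, 2, 3, 4, 5, 6], [1, 0, -1]))  # [-2, -2, -2, -2]
```

For a batch of n outputs and a kernel of width K, the cached window costs n + K − 1 loads where the naive version costs n × K, which is where the shader's savings come from.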

39 files changed: +1185 −1129 lines
.ci/scripts/setup-arm-baremetal-tools.sh

Lines changed: 11 additions & 0 deletions

```diff
@@ -0,0 +1,11 @@
+#!/bin/bash
+# Copyright 2024 Arm Limited and/or its affiliates.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.
+
+# NB: This function could be used to install Arm dependencies
+# Setup arm example environment (including TOSA tools)
+git config --global user.email "[email protected]"
+git config --global user.name "Github Executorch"
+bash examples/arm/setup.sh --i-agree-to-the-contained-eula
```

.ci/scripts/utils.sh

Lines changed: 0 additions & 11 deletions

```diff
@@ -59,17 +59,6 @@ install_flatc_from_source() {
   popd || return
 }

-install_arm() {
-  # NB: This function could be used to install Arm dependencies
-  # Setup arm example environment (including TOSA tools)
-  git config --global user.email "[email protected]"
-  git config --global user.name "Github Executorch"
-  bash examples/arm/setup.sh --i-agree-to-the-contained-eula
-
-  # Test tosa_reference flow
-  source examples/arm/ethos-u-scratch/setup_path.sh
-}
-
 build_executorch_runner_buck2() {
   # Build executorch runtime with retry as this step is flaky on macos CI
   retry buck2 build //examples/portable/executor_runner:executor_runner
```

.github/workflows/pull.yml

Lines changed: 3 additions & 5 deletions

```diff
@@ -354,13 +354,11 @@ jobs:
         EXECUTORCH_BUILD_ARM_BAREMETAL=ON \
         .ci/scripts/setup-linux.sh "${BUILD_TOOL}"

-        source .ci/scripts/utils.sh
         # Install Arm dependencies
-        install_arm
-
-        # Run pytest with coverage
-        pytest -c /dev/null -v -n auto --cov=./ --cov-report=xml backends/arm/test
+        .ci/scripts/setup-arm-baremetal-tools.sh

+        # Run pytest without simulator
+        backends/arm/test/test_arm_baremetal.sh test_pytest

   test-llama-runner-qnn-linux:
     name: test-llama-runner-qnn-linux
```

.github/workflows/trunk.yml

Lines changed: 6 additions & 5 deletions

```diff
@@ -146,14 +146,15 @@ jobs:
         source .ci/scripts/utils.sh
         install_executorch

-        install_arm
+        .ci/scripts/setup-arm-baremetal-tools.sh

         # Increase number of files user can monitor to bypass buck failures.
         # Hopefully this is high enough for this setup.
         sudo sysctl fs.inotify.max_user_watches=1048576 # 1024 * 1024

         # Test ethos-u delegate examples with run.sh
-        PYTHON_EXECUTABLE=python bash examples/arm/run.sh examples/arm/ethos-u-scratch/
+        backends/arm/test/test_arm_baremetal.sh test_run_ethosu_fvp
+

   test-arm-reference-delegation:
     name: test-arm-reference-delegation
@@ -172,10 +173,10 @@
         source .ci/scripts/utils.sh
         install_executorch

-        install_arm
+        .ci/scripts/setup-arm-baremetal-tools.sh

-        # Run arm unit tests
-        pytest -c /dev/null -v -n auto --cov=./ --cov-report=xml backends/arm/test
+        # Run arm unit tests using the simulator
+        backends/arm/test/test_arm_baremetal.sh test_pytest_ethosu_fvp

   test-coreml-delegate:
     name: test-coreml-delegate
```

backends/arm/README.md

Lines changed: 52 additions & 0 deletions

````diff
@@ -39,6 +39,28 @@ Other:
 - `third-party/` - Dependencies on other code - in particular the TOSA serialization_lib for compiling to TOSA and the ethos-u-core-driver for the bare-metal backend supporting Ethos-U
 - `test/` - Unit test and test support functions

+## Testing
+
+After setup you can run the unit tests with the test_arm_baremetal.sh script.
+
+To run the pytest suite, run
+
+```
+backends/arm/test/test_arm_baremetal.sh test_pytest
+```
+
+To run the unit test suite with Corstone3x0 FVP simulator support, use
+
+```
+backends/arm/test/test_arm_baremetal.sh test_pytest_ethosu_fvp
+```
+
+You can test running some models with the run.sh flow:
+
+```
+backends/arm/test/test_arm_baremetal.sh test_run_ethosu_fvp
+```
+
 ## Unit tests
 This is the structure of the test directory

@@ -51,6 +73,8 @@ test # Root test folder
 ├── tester # Arm Tester class
 ├── tosautil # Utility functions for TOSA artifacts
 ├ common.py # Common functions and definitions used by many tests
+├ setup_testing.sh # Script to prepare testing for using the Corstone 3x0 FVP
+├ test_arm_baremetal.sh # Help script to trigger testing
 ```

 Some example commands to run these tests follow. Run a single test:
@@ -59,6 +83,12 @@ Some example commands to run these tests follow. Run a single test:
 python -m unittest backends.arm.test.ops.test_add.TestSimpleAdd -k test_add2_tosa_BI
 ```

+or with pytest
+
+```
+pytest -c /dev/null -v -n auto backends/arm/test/ops/test_add.py -k test_add2_tosa_BI
+```
+
 Or all tests in "TestSimpleAdd":

 ```
@@ -71,6 +101,28 @@ Or discover and run many tests:
 python -m unittest discover -s backends/arm/test/ops/
 ```

+or with pytest
+
+```
+pytest -c /dev/null -v -n auto backends/arm/test/ops/
+```
+
+
+You can run the tests using the Corstone3x0 FVP simulators to see how they would work on something more target-like.
+First you need to build and prepare the target libraries that are used:
+
+```
+examples/arm/run.sh --model_name=add --build_only
+backends/arm/test/setup_testing.sh
+```
+
+Then you can run the tests with
+
+```
+pytest -c /dev/null -v -n auto backends/arm/test --arm_quantize_io --arm_run_corstoneFVP
+```
+
 ### A note on unit tests

 There are currently 3 ways we unit test our code.
````
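A side note on the pytest flags shown above: `--arm_quantize_io` and `--arm_run_corstoneFVP` are custom options, and pytest only accepts custom options that a `conftest.py` registers. The snippet below is a hypothetical sketch of how such flags are typically wired up, not the actual conftest in backends/arm/test:

```python
# Hypothetical conftest.py sketch; the real registration in
# backends/arm/test may differ in names and defaults.
import pytest


def pytest_addoption(parser):
    # Both flags default to off so a plain `pytest` run stays host-only.
    parser.addoption("--arm_quantize_io", action="store_true", default=False)
    parser.addoption("--arm_run_corstoneFVP", action="store_true", default=False)


@pytest.fixture
def run_on_fvp(request) -> bool:
    # Tests can consume this fixture to decide whether to launch the FVP.
    return request.config.getoption("--arm_run_corstoneFVP")
```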

backends/arm/quantizer/TARGETS

Lines changed: 11 additions & 1 deletion

```diff
@@ -5,12 +5,22 @@ python_library(
     srcs = ["arm_quantizer.py"],
     deps = [
         ":arm_quantizer_utils",
+        ":quantization_annotator",
         "//caffe2:torch",
-        "//executorch/backends/arm/quantizer/quantization_annotation:quantization_annotation",
         "//executorch/exir:lib",
     ],
 )

+python_library(
+    name = "quantization_annotator",
+    srcs = ["quantization_annotator.py"],
+    deps = [
+        ":arm_quantizer_utils",
+        ":quantization_config",
+        "//caffe2:torch",
+    ],
+)
+
 python_library(
     name = "quantization_config",
     srcs = ["quantization_config.py"],
```

backends/arm/quantizer/arm_quantizer.py

Lines changed: 10 additions & 113 deletions

```diff
@@ -13,24 +13,16 @@

 from __future__ import annotations

-import copy
 import functools
-from typing import Any, Callable, Dict, List, Optional, Set
+from typing import Any, Callable, Dict, List, Optional

 import torch
-import torch.nn.functional as F
 from executorch.backends.arm._passes.arm_pass_manager import ArmPassManager

 from executorch.backends.arm.quantizer import arm_quantizer_utils
-from executorch.backends.arm.quantizer.arm_quantizer_utils import (
-    mark_nodes_as_annotated,
-    propagate_annotation,
-)
-from executorch.backends.arm.quantizer.quantization_annotation import (
-    OP_TO_ANNOTATOR,
-    OperatorConfig,
-    OperatorPatternType,
-)
+from executorch.backends.arm.quantizer.arm_quantizer_utils import mark_node_as_annotated
+from executorch.backends.arm.quantizer.quantization_annotator import annotate_graph
+
 from executorch.backends.arm.quantizer.quantization_config import QuantizationConfig
 from torch.ao.quantization.fake_quantize import (
     FakeQuantize,
@@ -58,44 +50,6 @@
 ]


-def _supported_symmetric_quantized_operators() -> Dict[str, List[OperatorPatternType]]:
-    supported_operators: Dict[str, List[OperatorPatternType]] = {
-        # Both conv and linear should be able to handle relu + hardtanh fusion since
-        # those are clamp ops
-        "conv2d": [
-            [torch.nn.Conv2d, torch.nn.ReLU],
-            [torch.nn.Conv2d, F.relu],
-            [F.conv2d, torch.nn.ReLU],
-            [F.conv2d, F.relu],
-        ],
-        "linear": [[torch.nn.Linear], [F.linear]],
-        "add": [[torch.add]],
-        "max_pool2d": [[torch.nn.MaxPool2d], [F.max_pool2d]],
-        "adaptive_avg_pool2d": [
-            [torch.nn.AdaptiveAvgPool2d],
-            [F.adaptive_avg_pool2d],
-        ],
-        "mul": [[torch.mul]],
-        "sub": [[torch.sub]],
-        "min_max": [[torch.min], [torch.max]],
-    }
-    return copy.deepcopy(supported_operators)
-
-
-def _get_supported_symmetric_config_and_operators() -> List[OperatorConfig]:
-    supported_config_and_operators: List[OperatorConfig] = []
-    for quantization_config in [
-        get_symmetric_quantization_config(),
-        get_symmetric_quantization_config(is_per_channel=True),
-    ]:
-        ops = _supported_symmetric_quantized_operators()
-        for pattern_list in ops.values():
-            supported_config_and_operators.append(
-                OperatorConfig(quantization_config, pattern_list)
-            )
-    return copy.deepcopy(supported_config_and_operators)
-
-
 @functools.lru_cache
 def get_symmetric_quantization_config(
     is_per_channel: bool = False,
@@ -180,10 +134,6 @@ def get_symmetric_quantization_config(
     return quantization_config


-def _get_supported_config_and_operators() -> List[OperatorConfig]:
-    return _get_supported_symmetric_config_and_operators()
-
-
 NodeFilterType = Callable[[Node], bool]
 """Type for a Node Filter used by annotators. A Node filter is a function that takes
 a Node and returns whether the node should be annotated or not.
@@ -255,26 +205,6 @@ def not_module_type_or_name_filter(n: Node) -> bool:


 class ArmQuantizer(Quantizer):
-    supported_config_and_operators = _get_supported_config_and_operators()
-
-    # A list of supported static quantization annotators, in order of application.
-    # For example, fusions come before singular ops.
-    # The name must match the name used when registering the annotator.
-    STATIC_ANNOTATION_ORDER = [
-        "linear",
-        "conv",
-        "adaptive_avg_pool2d",
-        "max_pool2d",
-        "add",
-        "sub",
-        "mul",
-        "min_max",
-        "mm",
-        "one_to_one",
-        "generic",
-        "upsample_nearest2d",
-    ]
-
     def __init__(self) -> None:
         super().__init__()
         self.global_config: Optional[QuantizationConfig] = None
@@ -331,7 +261,6 @@ def annotate(self, model: GraphModule) -> GraphModule:
             The annotated model.
         """
         model = self._annotate_for_static_quantization_config(model)
-        propagate_annotation(model)
         return model

     def _annotate_all_static_patterns(
@@ -353,8 +282,7 @@ def _annotate_all_static_patterns(
         if quantization_config is None:
             return model

-        for op in self.STATIC_ANNOTATION_ORDER:
-            OP_TO_ANNOTATOR[op](model, quantization_config, filter_fn)
+        annotate_graph(model, quantization_config, filter_fn)
         return model

     def _annotate_for_static_quantization_config(
@@ -363,6 +291,9 @@ def _annotate_for_static_quantization_config(
         """Matches the correct QuantizationConfig with the correct module using a filter
         when running _annotate_all_static_patterns.
         """
+        if self.io_config:
+            self._annotate_io(model, self.io_config)
+
         module_name_list = list(self.module_name_config.keys())
         for module_name, config in self.module_name_config.items():
             self._annotate_all_static_patterns(
@@ -381,9 +312,6 @@ def _annotate_for_static_quantization_config(
                 _get_not_module_type_or_name_filter(tp_list, module_name_list),
             )

-        if self.io_config:
-            self._annotate_io(model, self.io_config)
-
         return model

     def _annotate_io(
@@ -399,44 +327,13 @@ def _annotate_io(
                     node,
                     quantization_config.get_output_act_qspec(),
                 )
-                mark_nodes_as_annotated([node])
+                mark_node_as_annotated(node)
             if node.op == "output":
                 parent = node.all_input_nodes[0]
                 _annotate_input_qspec_map(
                     node, parent, quantization_config.get_input_act_qspec()
                 )
-                mark_nodes_as_annotated([node])
+                mark_node_as_annotated(node)

     def validate(self, model: GraphModule) -> None:
         pass
-
-    @classmethod
-    def get_supported_operators(cls) -> List[OperatorConfig]:
-        return cls.supported_config_and_operators
-
-    @classmethod
-    def get_supported_quantization_configs(cls) -> List[QuantizationConfig]:
-        op_configs: Set[QuantizationConfig] = set({})
-        for spec, _ in cls.supported_config_and_operators:
-            op_configs.add(spec)
-        return list(op_configs)
-
-    @classmethod
-    def get_supported_operator_for_quantization_config(
-        cls, quantization_config: Optional[QuantizationConfig]
-    ) -> List[OperatorPatternType]:
-        if quantization_config is None:
-            all_ops = []
-            for _, ops in cls.supported_config_and_operators:
-                all_ops.extend(ops)
-            return all_ops
-
-        for config, ops in cls.supported_config_and_operators:
-            # note: this assumes each entry in cls.supported_spec_and_operators
-            # corresponds to one spec, e.g. we don't have
-            # [(spec1, op_list1), (spec1, op_list2), (spec2, op_list3)]
-            # where the first and second entry have the same spec but did not
-            # merge the op list
-            if config == quantization_config:
-                return ops
-        return []
```
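The structural change in this file is that the ordered `STATIC_ANNOTATION_ORDER` / `OP_TO_ANNOTATOR` dispatch is replaced by a single `annotate_graph(model, quantization_config, filter_fn)` call from the new `quantization_annotator` module, whose implementation is not part of this excerpt. As a rough sketch under that caveat, a graph-wide annotator of this shape walks every node once, applies the filter, and records quantization metadata:

```python
# Rough sketch of a graph-wide annotator with the same signature as
# annotate_graph; the real quantization_annotator.py is not shown in this
# diff, so the per-node handling here is an assumption.
from typing import Callable, Optional

from torch.fx import GraphModule, Node


def annotate_graph_sketch(
    model: GraphModule,
    quantization_config,  # QuantizationConfig in the real code
    filter_fn: Optional[Callable[[Node], bool]] = None,
) -> GraphModule:
    for node in model.graph.nodes:
        if node.op != "call_function":
            continue  # only actual ops get quantization annotations
        if filter_fn is not None and not filter_fn(node):
            continue  # honor the module_name / module_type filters
        # The real annotator would derive input/output quantization specs
        # from quantization_config; store a placeholder in node.meta here.
        node.meta.setdefault("quantization_annotation", quantization_config)
    return model
```

Compared to the per-op table, a single traversal keeps the annotation order out of a hand-maintained list, which is consistent with the deletion of `STATIC_ANNOTATION_ORDER` above.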
