
Fix xnnpack demo #122

Closed
wants to merge 1 commit into from
22 changes: 0 additions & 22 deletions examples/backend/README

This file was deleted.

33 changes: 33 additions & 0 deletions examples/backend/README.md
@@ -0,0 +1,33 @@
This README gives some examples of backend-specific model workflows.

# XNNPACK Backend

[XNNPACK](https://github.com/google/XNNPACK) is a library of optimized neural network inference operators for ARM and x86 platforms. Our delegate lowers models to run using these highly optimized CPU operators. You can try lowering and running some example models with the following commands:

## XNNPACK delegation-only

The following command will produce a floating-point XNNPACK-delegated model `mv2_xnnpack_fp32.pte` that can be run using XNNPACK's operators. It will also print the lowered graph, showing which parts of the model have been lowered to XNNPACK via `executorch_call_delegate`.

```bash
# For MobileNet V2
python3 -m examples.backend.xnnpack_examples --model_name="mv2" --delegate
```

Once we have the model binary (`.pte`) file, we can run it with the ExecuTorch runtime using the `xnn_executor_runner`:

```bash
buck2 run examples/backend:xnn_executor_runner -- --model_path ./mv2_xnnpack_fp32.pte
```

## XNNPACK quantization + delegation
The following command will produce an XNNPACK quantized and delegated model `mv2_xnnpack_q8.pte` that can be run using XNNPACK's operators. It will also print the lowered graph, showing which parts of the model have been lowered to XNNPACK via `executorch_call_delegate`.

```bash
python3 -m examples.backend.xnnpack_examples --model_name="mv2" --quantize --delegate
```

Once we have the model binary (`.pte`) file, we can run it with the ExecuTorch runtime using the `xnn_executor_runner`:

```bash
buck2 run examples/backend:xnn_executor_runner -- --model_path ./mv2_xnnpack_q8.pte
```
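For reference, the command-line surface shared by both invocations above can be sketched with `argparse`. Only the flag names (`--model_name`, `--quantize`, `--delegate`) are taken from the README commands; the help text and parser structure are assumptions, not the actual `xnnpack_examples.py` implementation:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch of the flags used by examples.backend.xnnpack_examples;
    # only the flag names come from the commands shown above.
    parser = argparse.ArgumentParser(prog="xnnpack_examples")
    parser.add_argument("--model_name", required=True, help="e.g. mv2")
    parser.add_argument("--quantize", action="store_true",
                        help="quantize the model before delegation")
    parser.add_argument("--delegate", action="store_true",
                        help="lower supported subgraphs to the XNNPACK backend")
    return parser


args = build_parser().parse_args(["--model_name=mv2", "--quantize", "--delegate"])
print(args.model_name)  # → mv2
```

Passing only `--delegate` (without `--quantize`) corresponds to the floating-point workflow in the previous section.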
17 changes: 3 additions & 14 deletions examples/backend/TARGETS
@@ -1,19 +1,8 @@
load("@fbsource//xplat/executorch/build:runtime_wrapper.bzl", "runtime")
# Any targets that should be shared between fbcode and xplat must be defined in
# targets.bzl. This file can contain fbcode-only targets.

load(":targets.bzl", "define_common_targets")

oncall("executorch")

define_common_targets()

runtime.python_binary(
name = "xnnpack_examples",
main_src = "xnnpack_examples.py",
deps = [
"//caffe2:torch",
"//executorch/backends/xnnpack:xnnpack_preprocess",
"//executorch/backends/xnnpack/partition:xnnpack_partitioner",
"//executorch/examples/models:models",
"//executorch/examples/quantization:quant_utils",
"//executorch/exir/backend:backend_api",
],
)
24 changes: 23 additions & 1 deletion examples/backend/targets.bzl
@@ -7,7 +7,29 @@ def define_common_targets():
TARGETS and BUCK files that call this function.
"""

# executor runner for XNNPACK Backend and portable kernels.
runtime.python_binary(
name = "xnnpack_examples",
main_module = "executorch.examples.backend.xnnpack_examples",
deps = [
":xnnpack_examples_lib",
],
)

runtime.python_library(
name = "xnnpack_examples_lib",
srcs = [
"xnnpack_examples.py",
],
deps = [
"//executorch/backends/xnnpack/partition:xnnpack_partitioner",
"//executorch/examples/models:models",
"//executorch/examples/quantization:quant_utils",
"//executorch/exir:lib",
"//executorch/exir/backend:backend_api",
],
)

# executor_runner for XNNPACK Backend and portable kernels.
runtime.cxx_binary(
name = "xnn_executor_runner",
srcs = [],
3 changes: 2 additions & 1 deletion examples/backend/xnnpack_examples.py
@@ -85,7 +85,8 @@

exec_prog = edge.to_executorch()
buffer = exec_prog.buffer
quant_tag = "_quantize" if args.quantize else ""

quant_tag = "q8" if args.quantize else "fp32"
filename = f"{args.model_name}_xnnpack_{quant_tag}.pte"
logging.info(f"Saving exported program to {filename}.")
with open(filename, "wb") as f:
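The suffix change above (from `"_quantize"`/`""` to `"q8"`/`"fp32"`) is what makes the script emit the `mv2_xnnpack_fp32.pte` and `mv2_xnnpack_q8.pte` names used throughout the README. The naming logic can be sketched in isolation as:

```python
def output_filename(model_name: str, quantize: bool) -> str:
    # Mirrors the updated suffix logic in xnnpack_examples.py:
    # quantized models are tagged "q8", floating-point models "fp32".
    quant_tag = "q8" if quantize else "fp32"
    return f"{model_name}_xnnpack_{quant_tag}.pte"


print(output_filename("mv2", quantize=False))  # → mv2_xnnpack_fp32.pte
print(output_filename("mv2", quantize=True))   # → mv2_xnnpack_q8.pte
```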