Qualcomm AI Engine Direct - FbNet enablement #2706

Closed · wants to merge 1 commit
1 change: 0 additions & 1 deletion backends/qualcomm/scripts/build.sh
@@ -71,7 +71,6 @@ if [ "$BUILD_AARCH64" = true ]; then
-DCMAKE_INSTALL_PREFIX=$BUILD_ROOT \
-DEXECUTORCH_BUILD_QNN=ON \
-DEXECUTORCH_BUILD_SDK=ON \
-DFLATCC_TEST=OFF \
@cccclai (Contributor) · Mar 31, 2024:

Any specific reason we turn it on? I guess I didn't realize it was OFF before.

@chunit-quic (Contributor, Author) · Apr 1, 2024:

We explicitly turned it OFF here before. Since PR 2466 recently turned it off by default, we no longer need to set it again here.

-DEXECUTORCH_ENABLE_EVENT_TRACER=ON \
-DQNN_SDK_ROOT=$QNN_SDK_ROOT \
-DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
60 changes: 57 additions & 3 deletions backends/qualcomm/tests/test_qnn_delegate.py
@@ -32,7 +32,8 @@
from executorch.examples.models.edsr import EdsrModel
from executorch.examples.models.inception_v3 import InceptionV3Model
from executorch.examples.models.inception_v4 import InceptionV4Model
from executorch.examples.models.llama2 import Llama2Model

# from executorch.examples.models.llama2 import Llama2Model
from executorch.examples.models.mobilebert import MobileBertModelExample
from executorch.examples.models.mobilenet_v2 import MV2Model
from executorch.examples.models.mobilenet_v3 import MV3Model
@@ -439,7 +440,8 @@ def test_qnn_backend_example_models(self):
            EdsrModel(),
            InceptionV3Model(),
            InceptionV4Model(),
            Llama2Model(),
            # The llama module changes frequently; re-enable once it is stable.
            # Llama2Model(),
            MV2Model(),
            MV3Model(),
            MobileBertModelExample(),
@@ -922,7 +924,8 @@ def test_qnn_backend_example_models(self):
{"module": EdsrModel(), "annotation": ()},
{"module": InceptionV3Model(), "annotation": ()},
{"module": InceptionV4Model(), "annotation": ()},
{"module": Llama2Model(), "annotation": ()},
# The module of llama is changing frequently. Reopen it when it's stable
# {"module": Llama2Model(), "annotation": ()},
{"module": MV2Model(), "annotation": ()},
{"module": MV3Model(), "annotation": ()},
# only works on QNN 2.12 so far
@@ -1221,6 +1224,51 @@ def test_qnn_backend_shared_buffer(self):
)


class TestExampleOssScript(TestQNN):
    def required_envs(self, conditions=None) -> bool:
        conditions = [] if conditions is None else conditions
        return all(
            [
                self.executorch_root,
                self.artifact_dir,
                *conditions,
            ]
        )

    def test_fbnet(self):
        if not self.required_envs([self.image_dataset]):
            self.skipTest("missing required envs")

        cmds = [
            "python",
            f"{self.executorch_root}/examples/qualcomm/oss_scripts/fbnet.py",
            "--dataset",
            self.image_dataset,
            "--artifact",
            self.artifact_dir,
            "--build_folder",
            self.build_folder,
            "--device",
            self.device,
            "--model",
            self.model,
            "--ip",
            self.ip,
            "--port",
            str(self.port),
        ]
        if self.host:
            cmds.extend(["--host", self.host])

        p = subprocess.Popen(cmds, stdout=subprocess.DEVNULL)
        with Listener((self.ip, self.port)) as listener:
            conn = listener.accept()
            p.communicate()
            msg = json.loads(conn.recv())
            self.assertGreaterEqual(msg["top_1"], 60)
            self.assertGreaterEqual(msg["top_5"], 90)


class TestExampleScript(TestQNN):
    def required_envs(self, conditions=None) -> bool:
        conditions = [] if conditions is None else conditions
@@ -1442,6 +1490,9 @@ def test_deeplab_v3(self):
self.assertGreaterEqual(msg["MIoU"], 0.55)

def test_dummy_llama2(self):
self.skipTest(
"The module of llama is changing frequently. Reopen it when it's stable"
)
if not self.required_envs():
self.skipTest("missing required envs")

@@ -1476,6 +1527,9 @@ def test_dummy_llama2(self):

    @unittest.expectedFailure
    def test_ptq_dummy_llama2(self):
        self.skipTest(
            "The llama module changes frequently; re-enable this test once it is stable."
        )
        if not self.required_envs():
            self.skipTest("missing required envs")

6 changes: 3 additions & 3 deletions build/executorch-config.cmake
@@ -36,13 +36,13 @@ set_target_properties(
target_include_directories(portable_kernels INTERFACE ${_root})

if(CMAKE_BUILD_TYPE MATCHES "Debug")
set(FLATCC_LIB flatcc_d)
set(FLATCCRT_LIB flatccrt_d)
else()
set(FLATCC_LIB flatcc)
set(FLATCCRT_LIB flatccrt)
endif()

set(lib_list
etdump bundled_program extension_data_loader ${FLATCC_LIB} mpsdelegate
etdump bundled_program extension_data_loader ${FLATCCRT_LIB} mpsdelegate
qnn_executorch_backend portable_ops_lib extension_module xnnpack_backend
XNNPACK cpuinfo pthreadpool vulkan_backend optimized_kernels
optimized_ops_lib optimized_native_cpu_ops_lib
128 changes: 128 additions & 0 deletions examples/qualcomm/oss_scripts/fbnet.py
@@ -0,0 +1,128 @@
# Copyright (c) Qualcomm Innovation Center, Inc.
# All rights reserved
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

import json
import os
import re
import sys
from multiprocessing.connection import Client

import numpy as np
import timm
from executorch.backends.qualcomm.quantizer.quantizer import QuantDtype
from executorch.examples.qualcomm.scripts.inception_v4 import get_dataset
from executorch.examples.qualcomm.scripts.utils import (
    build_executorch_binary,
    make_output_dir,
    setup_common_args_and_variables,
    SimpleADB,
    topk_accuracy,
)


if __name__ == "__main__":
    parser = setup_common_args_and_variables()
    parser.add_argument(
        "-a",
        "--artifact",
        help="path for storing generated artifacts by this example. Default ./fbnet",
        default="./fbnet",
        type=str,
    )

    parser.add_argument(
        "-d",
        "--dataset",
        help=(
            "path to the validation folder of the ImageNet dataset, "
            "e.g. --dataset imagenet-mini/val "
            "(for https://www.kaggle.com/datasets/ifigotin/imagenetmini-1000)"
        ),
        type=str,
        required=True,
    )

    args = parser.parse_args()

    if not args.compile_only and args.device is None:
        raise RuntimeError(
            "a device serial is required unless --compile_only is set. "
            "Please specify one with the -s/--device argument."
        )

    # ensure the working directory exists
    os.makedirs(args.artifact, exist_ok=True)

    instance = timm.create_model("fbnetc_100", pretrained=True).eval()

    data_num = 100
    inputs, targets, input_list = get_dataset(
        dataset_path=f"{args.dataset}",
        data_size=data_num,
    )

    pte_filename = "fbnet"

    build_executorch_binary(
        instance,
        inputs[0],
        args.model,
        f"{args.artifact}/{pte_filename}",
        inputs,
        quant_dtype=QuantDtype.use_8a8w,
    )

    if args.compile_only:
        sys.exit(0)

    adb = SimpleADB(
        qnn_sdk=os.getenv("QNN_SDK_ROOT"),
        artifact_path=f"{args.build_folder}",
        pte_path=f"{args.artifact}/{pte_filename}.pte",
        workspace=f"/data/local/tmp/executorch/{pte_filename}",
        device_id=args.device,
        host_id=args.host,
        soc_model=args.model,
    )
    adb.push(inputs=inputs, input_list=input_list)
    adb.execute()

    # collect output data
    output_data_folder = f"{args.artifact}/outputs"
    make_output_dir(output_data_folder)

    output_raws = []

    def post_process():
        for f in sorted(
            os.listdir(output_data_folder), key=lambda f: int(f.split("_")[1])
        ):
            filename = os.path.join(output_data_folder, f)
            # outputs are named output_<input index>_<output index>.raw; drop
            # every output tensor except the first, which is scored below
            if re.match(r"^output_[0-9]+_[1-9]\.raw$", f):
                os.remove(filename)
            else:
                output = np.fromfile(filename, dtype=np.float32)
                output_raws.append(output)

    adb.pull(output_path=args.artifact, callback=post_process)

    # top-k analysis
    predictions = []
    for i in range(data_num):
        predictions.append(
            np.fromfile(
                os.path.join(output_data_folder, f"output_{i}_0.raw"), dtype=np.float32
            )
        )

    k_val = [1, 5]
    topk = [topk_accuracy(predictions, targets, k).item() for k in k_val]
    if args.ip and args.port != -1:
        with Client((args.ip, args.port)) as conn:
            conn.send(json.dumps({f"top_{k}": topk[i] for i, k in enumerate(k_val)}))
    else:
        for i, k in enumerate(k_val):
            print(f"top_{k}->{topk[i]}%")
1 change: 1 addition & 0 deletions examples/qualcomm/oss_scripts/install_requirements.sh
@@ -0,0 +1 @@
pip install timm
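
timm is required because fbnet.py instantiates the network through it. A quick sanity check of the installed package, assuming network access to download the pretrained weights (the parameter-count print is illustrative only, not part of the example):

import timm

model = timm.create_model("fbnetc_100", pretrained=True).eval()
print(sum(p.numel() for p in model.parameters()))  # rough sanity check of the download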
3 changes: 3 additions & 0 deletions examples/qualcomm/scripts/dummy_llama2.py
@@ -39,6 +39,9 @@ def create_device_inputs(example_inputs, use_kv_cache):


if __name__ == "__main__":
    print(
Contributor:

Did you run into any issues with the script?

Contributor:

I tested it last week and it seemed OK.

@chunit-quic (Contributor, Author) · Apr 1, 2024:

Hi @cccclai,

We found some non-ideal behavior in our CI, so we think it is better to have this warning, for the following reasons:

1. In the 8a8w case, the output shape seems to differ from what it was before:
   python dummy_llama2.py --ptq 8a8w ...
2. In the 16a4w case, it now even fails to export:
   python dummy_llama2.py --ptq 16a4w ...
3. It prevents too many issues from being filed: users might want to try the script while we are still working on some of its components.

> I tested it last week and it seemed OK.

Would you mind sharing your command, please? We can also try to reproduce it and find the difference. Thanks! :D

Contributor:

Ah, I take my word back - I just tried exporting the model, and I see this error when I try to load it in the runtime:

[INFO] [Qnn ExecuTorch]: create QNN Logger with log_level 2
[WARNING] [Qnn ExecuTorch]:  <W> Initializing HtpProvider
[WARNING] [Qnn ExecuTorch]:  <W> Function not called, PrepareLib isn't loaded!
[INFO] [Qnn ExecuTorch]: Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO] [Qnn ExecuTorch]: Caching: Caching is in RESTORE MODE.
[WARNING] [Qnn ExecuTorch]:  <W> sg_stubPtr is not null, skip loadRemoteSymbols
[ERROR] [Qnn ExecuTorch]:  <E> DspTransport.openSession qnn_open failed, 0x80000406
[ERROR] [Qnn ExecuTorch]:  <E> IDspTransport: Unable to load lib 0x80000406
[ERROR] [Qnn ExecuTorch]:  <E> DspTransport failed,cannot open session, error 0x00000009
[ERROR] [Qnn ExecuTorch]:  <E> Unable to load Skel Library. transportStatus: 9
[ERROR] [Qnn ExecuTorch]:  <E> Failed to retrieve skel build id: err: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to create transport for device, error: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to load skel, error: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Transport layer setup failed: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to parse default platform info: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to load default platform info: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to parse platform config: 1008
[ERROR] [Qnn ExecuTorch]: Failed to create device_handle for Backend ID 6, error=1008
E 00:00:00.245462 executorch:QnnManager.cpp:154] Fail to configure Qnn device
E 00:00:00.245471 executorch:QnnExecuTorchBackend.cpp:54] Fail to initialize Qnn Manager
E 00:00:00.245478 executorch:method.cpp:106] Init failed for backend QnnBackend: 0x1
F 00:00:00.245497 executorch:qnn_executor_runner.cpp:215] In function main(), assert failed (method.ok()): Loading of method forward failed with status 0x1
Aborted

Any chance you know the reason?

Contributor:

Oh, also, I think the code change in llama_transformer.py might be the culprit behind the issue you saw.

Contributor:

Actually, the error might be specific to me, because I only have an SM8450. I just opened an issue here: #2788

Contributor (Author):

> Oh, also, I think the code change in llama_transformer.py might be the culprit behind the issue you saw.

Thank you for pointing out the possibility. We will investigate it later.

> Actually, the error might be specific to me, because I only have an SM8450. I just opened an issue here: #2788

We will find an SM8450 device and try to reproduce it. Once we have any news, we will reply in issue #2788. Thank you for the report.

Contributor:

May I ask what device you've been using? Is it an SM8450?

Contributor (Author):

No, I usually work on an SM8550. I haven't even personally tested on an SM8450 device.

"[WARNING] The module of llama is changing frequently. This script might not work"
)
parser = setup_common_args_and_variables()
parser.add_argument(
"-a",