Qualcomm AI Engine Direct - FbNet enablement #2706


Closed · wants to merge 1 commit

Conversation

chunit-quic (Contributor):

  • Add test cases (see the model sketch below)
  • Fix compile error
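
For context, FbNet here is the FBNet family of mobile CNNs. A minimal sketch of building the model and the example inputs a test case would need, assuming the `timm` package provides the FBNet variant (the actual test in this PR may construct the model differently):

```python
import timm
import torch

# "fbnetc_100" is timm's name for FBNet-C; the PR's test may use another source.
model = timm.create_model("fbnetc_100", pretrained=True).eval()

# FBNet expects a standard 224x224 ImageNet-style input.
example_inputs = (torch.randn(1, 3, 224, 224),)

with torch.no_grad():
    logits = model(*example_inputs)
print(logits.shape)  # torch.Size([1, 1000])
```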


pytorch-bot bot commented Mar 27, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/2706

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 3de82d5 with merge base 4111b3f:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label Mar 27, 2024
- Add test cases
@@ -39,6 +39,9 @@ def create_device_inputs(example_inputs, use_kv_cache):


if __name__ == "__main__":
print(
Contributor:

did you run into any issue with the script?

Contributor:

I tested it last week and it seemed OK.

chunit-quic (Contributor, Author) Apr 1, 2024:

Hi @cccclai,

We found some less-than-ideal behavior in our CI. For the following reasons we think it's better to have this warning:

  1. In the 8a8w case, the output shape seems to be different from what it was before:
     python dummy_llama2.py --ptq 8a8w ...
  2. In the 16a4w case, it now even fails to export:
     python dummy_llama2.py --ptq 16a4w ...
  3. It prevents too many issues from being filed, because users might want to try the script while we are still working on some of its components.

> I tested it last week and it seemed OK.

Would you mind sharing your command, please? We can then reproduce it and find the difference. Thanks! :D
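
For illustration, a minimal sketch of the kind of warning banner the truncated `print(` in the hunk above could emit; the exact wording in the PR is not shown here, so this text is hypothetical:

```python
if __name__ == "__main__":
    # Warn users up front that parts of this flow are still in progress,
    # so quantized exports may misbehave (see the 8a8w / 16a4w notes above).
    print(
        "[WARNING] This script is under active development. "
        "Some PTQ configurations (e.g., 8a8w, 16a4w) may produce "
        "unexpected output shapes or fail to export."
    )
```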

Contributor:

Ah, I take my words back. I just tried exporting the model and saw this error when trying to load it in the runtime:

[INFO] [Qnn ExecuTorch]: create QNN Logger with log_level 2
[WARNING] [Qnn ExecuTorch]:  <W> Initializing HtpProvider
[WARNING] [Qnn ExecuTorch]:  <W> Function not called, PrepareLib isn't loaded!
[INFO] [Qnn ExecuTorch]: Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO] [Qnn ExecuTorch]: Caching: Caching is in RESTORE MODE.
[WARNING] [Qnn ExecuTorch]:  <W> sg_stubPtr is not null, skip loadRemoteSymbols
[ERROR] [Qnn ExecuTorch]:  <E> DspTransport.openSession qnn_open failed, 0x80000406
[ERROR] [Qnn ExecuTorch]:  <E> IDspTransport: Unable to load lib 0x80000406
[ERROR] [Qnn ExecuTorch]:  <E> DspTransport failed,cannot open session, error 0x00000009
[ERROR] [Qnn ExecuTorch]:  <E> Unable to load Skel Library. transportStatus: 9
[ERROR] [Qnn ExecuTorch]:  <E> Failed to retrieve skel build id: err: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to create transport for device, error: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to load skel, error: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Transport layer setup failed: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to parse default platform info: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to load default platform info: 1008
[ERROR] [Qnn ExecuTorch]:  <E> Failed to parse platform config: 1008
[ERROR] [Qnn ExecuTorch]: Failed to create device_handle for Backend ID 6, error=1008
E 00:00:00.245462 executorch:QnnManager.cpp:154] Fail to configure Qnn device
E 00:00:00.245471 executorch:QnnExecuTorchBackend.cpp:54] Fail to initialize Qnn Manager
E 00:00:00.245478 executorch:method.cpp:106] Init failed for backend QnnBackend: 0x1
F 00:00:00.245497 executorch:qnn_executor_runner.cpp:215] In function main(), assert failed (method.ok()): Loading of method forward failed with status 0x1
Aborted

Any chance you know the reason?

Contributor:

Oh, also: I think the code change in llama_transformer.py might be the culprit for the issue you saw.

Contributor:

Actually, the error message might be specific to me, because I only have an SM8450. I just opened an issue here: #2788.

Contributor (Author):

> Oh, also: I think the code change in llama_transformer.py might be the culprit for the issue you saw.

Thank you for pointing out the possibility. We will investigate it later.

> Actually, the error message might be specific to me, because I only have an SM8450. I just opened an issue here: #2788.

We will find an SM8450 device and try to reproduce it. Once we have any news, we will reply in issue #2788. Thank you for the report.

Contributor:

May I ask what device you've been using? Is it an SM8450?

Contributor (Author):

No, I usually work on an SM8550. I haven't even tested an SM8450 device personally.
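
For reference, the Qualcomm example scripts select the target SoC through a command-line flag, and the QNN runtime loads matching Hexagon (HTP) skel libraries for that SoC, which is likely why an SM8450 vs. SM8550 mismatch can surface as the skel-loading errors above. A minimal argparse sketch of that pattern; the flag name and default are assumptions, so check the script's --help for the real interface:

```python
import argparse

parser = argparse.ArgumentParser()
# SoC model of the attached device (assumed flag); the QNN runtime picks
# the matching Hexagon (HTP) skel libraries based on this, so SM8450 vs.
# SM8550 matters at load time.
parser.add_argument(
    "-m", "--model", default="SM8550",
    help="SoC model of the target device, e.g. SM8450 or SM8550",
)
args = parser.parse_args()
print(f"Targeting SoC: {args.model}")
```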

@facebook-github-bot (Contributor):

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@@ -71,7 +71,6 @@ if [ "$BUILD_AARCH64" = true ]; then
-DCMAKE_INSTALL_PREFIX=$BUILD_ROOT \
-DEXECUTORCH_BUILD_QNN=ON \
-DEXECUTORCH_BUILD_SDK=ON \
-DFLATCC_TEST=OFF \
cccclai (Contributor) Mar 31, 2024:

Any specific reason we turn it on? I guess I didn't realize it was OFF before

chunit-quic (Contributor, Author) Apr 1, 2024:

We explicitly turned it OFF before. Since PR 2466 recently turned it off by default, we don't need to set it here anymore.

@facebook-github-bot (Contributor):

@cccclai merged this pull request in 15d9ddd.
