Qualcomm AI Engine Direct - LLAMA2 Infrastructure #2020

chunit-quic · 2024-02-21T05:53:51Z

OPs

Add pow_tensor_scalar op
Add rsqrt op
Add sigmoid op
Refine axis handling of cat op
Refine parameter-related functions
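
For context, a plausible reason these ops are grouped in a LLAMA2 change is that Llama-style RMSNorm squares its input (a pow with a scalar exponent) and multiplies by a reciprocal square root, while SiLU relies on sigmoid. The PR does not state this, so treat it as an assumption. The sketch below uses plain Python lists rather than tensors, and `rms_norm` is an illustrative name, not a function from this PR.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Illustrative RMSNorm over a 1-D list (hypothetical helper)."""
    # pow with a scalar exponent: square every element
    squared = [v ** 2 for v in x]
    mean_sq = sum(squared) / len(squared)
    # rsqrt: reciprocal of the square root
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]


out = rms_norm([3.0, 4.0], [1.0, 1.0], eps=0.0)
```

With `eps=0.0` and unit weights, the mean of the squared outputs is 1 up to floating-point error, which is the normalization property the op pair provides.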

Passes

Add AnnotateDecomposed for unbind and stack op
Add DecomposeSilu for quantizer
Add ReplaceInfBuffer for quantizer
Change pass name ConvertAddmmmmWithLinear to ConvertToLinear
Change pass name ConvertScaledDotProductAttention to DecomposeScaledDotProductAttention
Support more args for sdpa op in DecomposeScaledDotProductAttention
Support mm case for ConvertToLinear
Move q_ops and dq_ops to pass/utils.py
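
As a rough illustration of what a SiLU decomposition relies on, SiLU can be expressed as the input multiplied by its sigmoid, so a backend that supports sigmoid and elementwise multiply can cover it. The sketch below only shows that identity on scalars; it is not the DecomposeSilu pass from this PR, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def silu_decomposed(x):
    """SiLU written as two simpler ops: sigmoid followed by multiply."""
    return x * sigmoid(x)


values = [silu_decomposed(v) for v in (-1.0, 0.0, 1.0)]
```

In a graph pass, the same idea would presumably replace one silu node with a sigmoid node and a mul node so the quantizer can annotate each separately.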

Tests

Add dummy llama2 test script
Add single op test cases

Others

Fix error when popping a missing buffer
Reorder test models
Reorder ops in qnn_constant

pytorch-bot · 2024-02-21T05:53:54Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/2020

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Merge Blocking SEVs

There is 1 active merge-blocking SEV. Please view it below:

(merge blocking) CI is too red, viable/strict branch didn't update for 5 days

If you must merge, use @pytorchbot merge -f.

✅ No Failures

As of commit fab30bc with merge base 8fed60b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-02-21T17:55:33Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2024-02-21T22:18:30Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2024-02-22T05:51:46Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cccclai · 2024-02-22T06:21:13Z

exir/backend/backend_api.py

@@ -250,7 +250,10 @@ def _partition_and_lower_one_graph_module(
                # Delete the consumed buffers
                buffer_name = toplevel_signature.inputs_to_buffers.pop(node.name)
                toplevel_signature.buffers.remove(buffer_name)
-                owning_program.state_dict.pop(buffer_name)
+                if buffer_name in owning_program.state_dict:


Oh thanks for catching this case
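
The guard in the hunk above can be sketched in isolation as follows. This is a minimal illustration on a plain dict, assuming the state dict behaves like a standard mapping; `drop_consumed_buffer` is a hypothetical helper, not code from the PR.

```python
def drop_consumed_buffer(state_dict, buffer_name):
    """Remove buffer_name if present; return True when an entry was removed."""
    # An unguarded state_dict.pop(buffer_name) raises KeyError when the
    # buffer is missing, so check membership first.
    if buffer_name in state_dict:
        state_dict.pop(buffer_name)
        return True
    return False


state = {"weight": 1, "mask": 2}
first = drop_consumed_buffer(state, "mask")
second = drop_consumed_buffer(state, "mask")
```

An equivalent one-liner is `state_dict.pop(buffer_name, None)`, which also tolerates a missing key; the explicit membership check mirrors the shape of the diff.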

facebook-github-bot · 2024-02-22T07:40:37Z

@cccclai merged this pull request in f707590.

facebook-github-bot added the CLA Signed label Feb 21, 2024

cccclai approved these changes Feb 21, 2024

chunit-quic force-pushed the llama_infra branch from abbadfc to fab30bc February 22, 2024 05:22

cccclai reviewed Feb 22, 2024

facebook-github-bot closed this in f707590 Feb 22, 2024

facebook-github-bot added the Merged label Feb 22, 2024
