Qualcomm AI Engine Direct - LLAMA2 Infrastructure #2020


Closed
chunit-quic wants to merge 1 commit

Conversation

chunit-quic
Contributor

1. OPs
   - Add pow_tensor_scalar op
   - Add rsqrt op
   - Add sigmoid op
   - Refine axis handling of cat op
   - Refine parameter-related functions
2. Passes
   - Add AnnotateDecomposed for unbind and stack ops
   - Add DecomposeSilu for quantizer (an illustrative sketch of this decomposition follows the list)
   - Add ReplaceInfBuffer for quantizer
   - Rename pass ConvertAddmmmmWithLinear to ConvertToLinear
   - Rename pass ConvertScaledDotProductAttention to DecomposeScaledDotProductAttention
   - Support more args for sdpa op in DecomposeScaledDotProductAttention
   - Support mm case for ConvertToLinear
   - Move q_ops and dq_ops to pass/utils.py
3. Tests
   - Add dummy llama2 test script
   - Add single op test cases
4. Others
   - Fix error of popping missing buffer
   - Reorder test models
   - Reorder ops in qnn_constant
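As a rough illustration of what a decomposition pass such as DecomposeSilu does, the sketch below rewrites silu(x) into x * sigmoid(x) at the torch.fx level, so a quantizer only has to annotate ops it already supports. This is a minimal standalone sketch, not the pass added in this PR (which presumably matches the exported aten/edge op rather than the eager functional); the function and module names here are made up for the example.

```python
import torch
import torch.fx


def decompose_silu(gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
    """Illustrative stand-in for a DecomposeSilu-style pass:
    replace every silu(x) node with x * sigmoid(x)."""
    for node in list(gm.graph.nodes):
        if node.op == "call_function" and node.target is torch.nn.functional.silu:
            x = node.args[0]
            # Insert sigmoid(x) and x * sigmoid(x) right after the silu node.
            with gm.graph.inserting_after(node):
                sig = gm.graph.call_function(torch.sigmoid, (x,))
            with gm.graph.inserting_after(sig):
                mul = gm.graph.call_function(torch.mul, (x, sig))
            # Redirect all users of silu to the new product, then drop silu.
            node.replace_all_uses_with(mul)
            gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm


class TinySilu(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.silu(x)


gm = decompose_silu(torch.fx.symbolic_trace(TinySilu()))
x = torch.randn(2, 4)
assert torch.allclose(gm(x), torch.nn.functional.silu(x))
```

The rewrite is numerically identical to silu, so it only changes which primitive ops the downstream quantizer and backend see.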


pytorch-bot bot commented Feb 21, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/2020

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Merge Blocking SEV

There is 1 active merge-blocking SEV. Please view it below:

If you must merge, use @pytorchbot merge -f.

✅ No Failures

As of commit fab30bc with merge base 8fed60b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Feb 21, 2024
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


@@ -250,7 +250,10 @@ def _partition_and_lower_one_graph_module(
     # Delete the consumed buffers
     buffer_name = toplevel_signature.inputs_to_buffers.pop(node.name)
     toplevel_signature.buffers.remove(buffer_name)
-    owning_program.state_dict.pop(buffer_name)
+    if buffer_name in owning_program.state_dict:
Contributor
Oh thanks for catching this case
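For context on the hunk above: the "popping missing buffer" error in the description is a KeyError raised when the buffer name is no longer present in state_dict, and the added check skips the pop in that case. A minimal illustration with made-up names (not the real ExecuTorch structures):

```python
# Illustrative stand-in for owning_program.state_dict; contents are made up.
state_dict = {"layer.weight": 0.0}

buffer_name = "already_removed_buffer"

# Unconditional pop raises KeyError when the name is absent:
# state_dict.pop(buffer_name)  # KeyError: 'already_removed_buffer'

# Guarded pop, mirroring the change above, simply skips missing entries:
if buffer_name in state_dict:
    state_dict.pop(buffer_name)
```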

@facebook-github-bot
Contributor

@cccclai merged this pull request in f707590.

Labels: CLA Signed, Merged