Qualcomm AI Engine Direct - Support topk #5870


Closed · wants to merge 1 commit

Conversation

@winskuo-quic (Collaborator) commented Oct 4, 2024

Summary

  • Support topK
  • Decompose einsum properly so that quantization annotation works as expected
  • Unify warning messages in the op builder
  • Add unit tests
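For context, the op being enabled corresponds to `torch.topk`; a minimal sketch of the pattern an op builder has to lower (illustrative only, not code from this PR):

```python
import torch

# torch.topk returns the k largest values along a dimension,
# together with their indices -- the op this PR adds a builder for.
x = torch.tensor([[1.0, 4.0, 2.0, 3.0]])
values, indices = torch.topk(x, k=2, dim=-1)
# values  -> tensor([[4., 3.]])
# indices -> tensor([[1, 3]])
```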

pytorch-bot commented Oct 4, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5870

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit c2712dd with merge base 2e4e17c (image):

BROKEN TRUNK - The following job failed but was already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 4, 2024
@winskuo-quic (Collaborator, author)
Hi @cccclai,
This PR is to enable topK op builder and properly decompose einsum for quantization.
Please have a look.
Thanks.

@cccclai (Contributor) left a comment

This looks great! In the long term, it would be great to separate changes so that each PR covers one feature; this PR seems to include more than just topk.

from torch.fx.experimental.proxy_tensor import make_fx


class DecomposeEinsum(ExportPass):
Contributor:

This pattern seems common enough; we'll see whether we can support this kind of logic more easily.
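The `DecomposeEinsum` pass above builds on `make_fx`. A sketch of the underlying idea: retracing an einsum call through `make_fx` decomposes it into primitive aten ops, so each intermediate gets its own graph node that a quantization annotator can attach to. This is an illustration of the mechanism, not the exact pass from this PR:

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx

def einsum_fn(a, b):
    # Batched matmul expressed as einsum, like patterns found in
    # transformer code (e.g. MOEFeedForward).
    return torch.einsum("bij,bjk->bik", a, b)

a = torch.randn(2, 3, 4)
b = torch.randn(2, 4, 5)

# Retracing through make_fx lowers einsum to primitive aten ops,
# producing a GraphModule whose nodes can be annotated individually.
gm = make_fx(einsum_fn)(a, b)

out = gm(a, b)
assert torch.allclose(out, einsum_fn(a, b), atol=1e-5)
```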

@@ -416,6 +416,17 @@ def forward(self, x):
return torch.sum(self.first(x), dim=(2, 3), keepdim=False)


class Conv2dTopK(torch.nn.Module):
Contributor:

I assume it's just to test this kind of common pattern, and they won't be used elsewhere?

Collaborator (author):

Yes. This is just for testing purposes, to check that the layout transform is working properly.
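For readers of this thread, a self-contained sketch of what such a conv-then-topk test module looks like (the channel counts, kernel size, and k value here are assumptions; the actual test lives in this PR's diff):

```python
import torch

class Conv2dTopK(torch.nn.Module):
    """Chains a conv (layout-sensitive) with topk to exercise the
    backend's layout-transform pass. Shapes are illustrative."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)

    def forward(self, x):
        x = self.conv(x)
        # top-5 along the channel dimension
        values, _ = torch.topk(x, k=5, dim=1)
        return values

m = Conv2dTopK().eval()
out = m(torch.randn(1, 3, 8, 8))
# spatial dims are preserved; channel dim is reduced to k=5
assert out.shape == (1, 5, 8, 8)
```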

super().__init__()

def forward(self, i, j):
return torch.einsum("i,j->ij", i, j)
Contributor:

Is einsum used somewhere?

@winskuo-quic (Collaborator, author) commented Oct 7, 2024:

There is an einsum operation in MOEFeedForward under executorch.examples.models.llama2.llama_transformer. Even though we are not using MOEFeedForward now, we think it is a good test case to ensure our topK support is working as expected.

Contributor:

Ah, I see. Actually, einsum is probably not the best choice for the QNN backend; it's better to replace it with another formulation (I tried this before) to get better performance. But it's still good to be able to run einsum.
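One way to see the reviewer's point: an outer-product einsum like the one in the test module can be expressed with plain broadcasting, which avoids relying on einsum lowering in the backend. This is a sketch of the general rewrite idea, not a claim about QNN's actual op coverage:

```python
import torch

i = torch.randn(4)
j = torch.randn(5)

# Outer product via einsum, as in the test module above.
out_einsum = torch.einsum("i,j->ij", i, j)

# Equivalent formulation with explicit unsqueeze + broadcasting,
# expressed entirely in elementwise ops the backend already supports.
out_broadcast = i.unsqueeze(1) * j.unsqueeze(0)

assert out_einsum.shape == (4, 5)
assert torch.allclose(out_einsum, out_broadcast)
```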

@facebook-github-bot

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@winskuo-quic (Collaborator, author) commented Oct 7, 2024

> This looks great! In the long term, it would be great to separate changes so that each PR covers one feature; this PR seems to include more than just topK.

Thanks for all the suggestions; I will separate individual features into smaller PRs in the future. Sorry that the title is a little misleading. Originally, we were trying to enable topK and used MOEFeedForward as a test case, since there is a topK in there. However, we ran into other issues, such as einsum decomposition. I will keep the title more accurate in the future and make sure each PR focuses on one feature.

Thanks!

@facebook-github-bot

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai (Contributor) commented Oct 8, 2024

Hey, I may need to get #5925 checked in first because it needs to be cherry-picked. Then I'll merge this one.

@winskuo-quic (Collaborator, author)

> Hey, I may need to get #5925 checked in first because it needs to be cherry-picked. Then I'll merge this one.

Sounds good! I can also rebase after #5925 is merged if there are any conflicts.

@cccclai (Contributor) commented Oct 9, 2024

> > Hey, I may need to get #5925 checked in first because it needs to be cherry-picked. Then I'll merge this one.
>
> Sounds good! I can also rebase after #5925 is merged if there are any conflicts.

Hi, sorry for keeping you waiting. Another PR, #6025, needs to be merged and is also needed for the beta... the beta release is happening in 2 days, and hopefully that is the last PR.

@cccclai (Contributor) commented Oct 10, 2024

Hi, it's ready to merge now. Can you rebase? Thank you!

@winskuo-quic (Collaborator, author)

> Hi, it's ready to merge now. Can you rebase? Thank you!

I have just rebased. Thanks

@facebook-github-bot

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@cccclai merged this pull request in d094b09.
