[llama-mm] Enable kv cache for MultiHeadAttention #6793

larryliu0820 · 2024-11-12T21:08:00Z

Stack from ghstack (oldest at bottom):

-> [llama-mm] Enable kv cache for MultiHeadAttention #6793

Summary: Change `MultiHeadAttention` in `extension/llm/modules` to
support KV cache. This enables eager mode only; export is not supported yet.

Test Plan: Unit test

Reviewers:

Subscribers:

Tasks:

Tags:
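
For context, here is a minimal sketch of what a KV cache buys during eager decoding. It is not the code in this PR: it assumes a single attention head and plain Python lists instead of tensors, and the names `KVCache`, `attend`, `full_causal_attention`, and `cached_decode` are hypothetical, chosen only for illustration. The point it shows is that appending each new token's key and value to a cache gives the same output as recomputing causal attention over the whole prefix.

```python
import math


class KVCache:
    """Hypothetical fixed-size cache for one attention head (not the PR's class)."""

    def __init__(self, max_seq_len, head_dim):
        # Preallocate storage so decoding never grows the buffers.
        self.k = [[0.0] * head_dim for _ in range(max_seq_len)]
        self.v = [[0.0] * head_dim for _ in range(max_seq_len)]
        self.size = 0  # number of positions filled so far

    def update(self, k_new, v_new):
        """Store keys/values for new positions and return the filled prefix."""
        if self.size + len(k_new) > len(self.k):
            raise ValueError("KV cache overflow")
        for k_row, v_row in zip(k_new, v_new):
            self.k[self.size] = list(k_row)
            self.v[self.size] = list(v_row)
            self.size += 1
        return self.k[: self.size], self.v[: self.size]


def attend(q, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


def full_causal_attention(qs, ks, vs):
    """Reference path: position t attends over positions 0..t, recomputed each time."""
    return [attend(q, ks[: t + 1], vs[: t + 1]) for t, q in enumerate(qs)]


def cached_decode(qs, ks, vs, max_seq_len):
    """Incremental path: feed one token at a time and reuse the cached K/V."""
    cache = KVCache(max_seq_len, len(ks[0]))
    outputs = []
    for q, k, v in zip(qs, ks, vs):
        keys, values = cache.update([k], [v])
        outputs.append(attend(q, keys, values))
    return outputs
```

A unit test in this spirit would compare `cached_decode(...)` against `full_causal_attention(...)` on the same inputs and expect matching outputs.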

Summary: Change `MultiHeadAttention` in `extension/llm/modules` to support KV cache. Only enable eager but not export yet. Test Plan: Unit test Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: dd92b38 Pull Request resolved: #6793

pytorch-bot · 2024-11-12T21:08:03Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6793

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e72ac0b with merge base 4947e27:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Summary: Change `MultiHeadAttention` in `extension/llm/modules` to support KV cache. Only enable eager but not export yet. Test Plan: Unit test Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: ded3830 Pull Request resolved: #6793 Co-authored-by: Mengwei Liu <[email protected]>

[llama-mm] Enable kv cache for MultiHeadAttention

d4e8c6e

facebook-github-bot added the CLA Signed label Nov 12, 2024

Update on "[llama-mm] Enable kv cache for MultiHeadAttention"

e72ac0b

jackzhxng approved these changes Nov 12, 2024

larryliu0820 merged commit c711730 into gh/larryliu0820/55/base Nov 12, 2024
39 checks passed

larryliu0820 deleted the gh/larryliu0820/55/head branch November 12, 2024 22:17

larryliu0820 temporarily deployed to cherry-pick-bot November 12, 2024 22:17 — with GitHub Actions Inactive

pytorchbot mentioned this pull request Nov 12, 2024

[llama-mm] Enable kv cache for MultiHeadAttention #6798

Merged
