[executorch] generation.py with kv cache #3030
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/3030
Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Unrelated Failure
As of commit 9a4c1f6 with merge base 7616d42:

NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but was present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 00e9159 to 5c98ea6
@lucylq has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from 5c98ea6 to b48bfd1
@lucylq has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Lint? Also, does this work with our current models, Llama2 and stories?
Summary: python e2e generation, using the tiktoken tokenizer. Uses text_completion; haven't tried chat_completion.

Test Plan: Imported from GitHub, without a `Test Plan:` line.

Command, with prompt "Hello, I am" and seq_len = 10:
```
python -m examples.models.llama2.runner.generation --pte llama_4ckpts_x.pte --tokenizer tokenizer.model --prompt="Hello I am" --temperature=0 --params ../llama-models/llama3/params_less.json --max_gen_len=10
```

fp32, xnn, kv and fp32, xnn give the same result:
```
Result: [{'generation': ' a 25 year old woman. I am a'}]
```

fp32, xnn, int4:
```
Result: [{'generation': ' interested in the following products: - 1 x'}]
```

fp32, xnn, kv, sdpa (needs investigation):
```
Result: [{'generation': 'ฉopteraenthalenthalenthalenthalenthalenthalenthalenthal'}]
```

Reviewed By: larryliu0820

Differential Revision: D56087430

Pulled By: lucylq
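For orientation, here is a minimal, hypothetical sketch of the text_completion flow the summary describes: prefill the prompt once, then decode token by token, letting the model's KV cache carry the context so each step only feeds the newest token at its position. The `model` and `tokenizer` objects and the `forward(tokens, start_pos)` signature are stand-ins and do not reflect the exact runner API in this PR.

```python
# Hypothetical sketch of KV-cache text completion. `model` stands in for the
# loaded .pte program and `tokenizer` for the tiktoken tokenizer; neither
# mirrors the exact API of examples.models.llama2.runner.generation.
import torch


def text_completion_sketch(model, tokenizer, prompt: str, max_gen_len: int = 10) -> str:
    tokens = tokenizer.encode(prompt)

    # Prefill: run the full prompt once so the model's KV cache holds its context.
    logits = model.forward(torch.tensor([tokens]), start_pos=0)

    for _ in range(max_gen_len):
        # The test command uses --temperature=0, so decoding here is greedy argmax.
        next_token = int(torch.argmax(logits[:, -1, :], dim=-1).item())
        tokens.append(next_token)
        if next_token == tokenizer.eos_id:
            break
        # Decode step: with a KV cache only the newest token is fed, at its position.
        logits = model.forward(torch.tensor([[next_token]]), start_pos=len(tokens) - 1)

    return tokenizer.decode(tokens)
```

Under these assumptions, the greedy path matches the `--temperature=0` command above; a sampling path would replace the argmax with temperature-scaled sampling over the last-position logits.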
Force-pushed from b48bfd1 to 9a4c1f6
This pull request was exported from Phabricator. Differential Revision: D56087430
No description provided.