Llama2 model cleanup #5859

jackzhxng · 2024-10-03T23:33:53Z

Summary

- Removes redundant steps in the Llama2 export
- Factors out checkpointing to be shared with future Llama models (namely 3.2 multimodal)
- Comments and orders code more clearly

PR chain:

- Add kwarg example inputs to eager model base
- YOU ARE HERE ~> Llama2 model cleanup
- Accept model type parameter in export_llama
- Export TorchTune llama3_2_vision in ET

Test plan

Ensure that export and eval results are similar before and after the change. Before:

wikitext: {'word_perplexity,none': 14464.645927166595, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.99788806086652, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.5844545973083983, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}

After:

wikitext: {'word_perplexity,none': 14464.299192404438, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.997861173678705, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.584448130015399, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
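To make "similar" concrete, the two result sets above can be compared metric by metric. The following is a minimal sketch, not part of the PR: the helper `relative_diffs` and the threshold `REL_TOL` are hypothetical choices made for illustration, and the numbers are copied from the Before and After results above.

```python
# Metrics copied from the "Before" and "After" wikitext results above.
before = {
    "word_perplexity": 14464.645927166595,
    "byte_perplexity": 5.99788806086652,
    "bits_per_byte": 2.5844545973083983,
}
after = {
    "word_perplexity": 14464.299192404438,
    "byte_perplexity": 5.997861173678705,
    "bits_per_byte": 2.584448130015399,
}

# Hypothetical threshold: the PR only asks for "similar" results,
# so this value is an assumption, not a project requirement.
REL_TOL = 1e-4


def relative_diffs(a, b):
    """Return |a - b| / |a| for every metric present in both dicts."""
    return {k: abs(a[k] - b[k]) / abs(a[k]) for k in a if k in b}


diffs = relative_diffs(before, after)
for name, diff in diffs.items():
    status = "ok" if diff <= REL_TOL else "DRIFT"
    print(f"{name}: relative diff {diff:.2e} [{status}]")
```

With these inputs every metric differs by well under the assumed tolerance, which is consistent with the cleanup not changing model behavior.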

pytorch-bot · 2024-10-03T23:33:56Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5859

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fb1312f with merge base 3a7056e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Summary: For situations where the forward has non-positional arguments, such as https://github.com/pytorch/torchtune/blob/3c450ef5f1fbe8237f899e942fd5222491a47ca7/torchtune/modules/transformer.py#L519

PR chain:
- **YOU ARE HERE ~>** [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Pull Request resolved: #5765

Test Plan: Exported Stories110M model.
```
wget "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt"
echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json
python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -X -kv
```

Reviewed By: tarun292

Differential Revision: D64027696

Pulled By: dvorjackz

fbshipit-source-id: 15ecfb458c6194159140d4c601e5443a2e524fdc
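The motivation in the commit message above, a forward that takes arguments which cannot be passed positionally, can be illustrated with a small sketch. Everything here is hypothetical and written only for illustration: the class names `EagerModelSketch` and `DecoderSketch`, the method `get_example_kwarg_inputs`, the helper `run_with_examples`, and the toy `input_pos` argument are assumptions, not the ExecuTorch or TorchTune API.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class EagerModelSketch:
    """Hypothetical stand-in for an eager model base class."""

    def get_eager_model(self) -> Callable[..., Any]:
        raise NotImplementedError

    def get_example_inputs(self) -> Tuple[Any, ...]:
        raise NotImplementedError

    def get_example_kwarg_inputs(self) -> Optional[Dict[str, Any]]:
        # Default: the model only needs positional example inputs.
        return None


class DecoderSketch(EagerModelSketch):
    """Toy model whose forward has a keyword-only argument."""

    def get_eager_model(self) -> Callable[..., Any]:
        def forward(tokens, *, input_pos=None):
            # Return one position index per token, offset by input_pos.
            start = 0 if input_pos is None else input_pos
            return [start + i for i in range(len(tokens))]

        return forward

    def get_example_inputs(self) -> Tuple[Any, ...]:
        return ([101, 102, 103],)

    def get_example_kwarg_inputs(self) -> Optional[Dict[str, Any]]:
        return {"input_pos": 5}


def run_with_examples(model: EagerModelSketch) -> Any:
    """Call the eager model with both positional and keyword example inputs."""
    args = model.get_example_inputs()
    kwargs = model.get_example_kwarg_inputs() or {}
    return model.get_eager_model()(*args, **kwargs)


result = run_with_examples(DecoderSketch())
print(result)
```

Returning `None` by default keeps models with purely positional inputs unchanged, while models with keyword-only arguments can override the extra method.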

facebook-github-bot · 2024-10-09T23:10:35Z

@dvorjackz has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2024-10-11T01:58:50Z

This pull request was exported from Phabricator. Differential Revision: D64145852

Summary:
- Removes redundant steps in the Llama2 export
- Factors out checkpointing to be shared with future Llama models (namely 3.2 multimodal)
- Comments and orders code more clearly

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- **YOU ARE HERE ~>** [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Test Plan: Ensure export + eval is similar before and after for Stories 110M:
```
python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model/bin> -d fp32 --max_seq_len 2048 --limit 1000
```

Before:
```
wikitext: {'word_perplexity,none': 14464.645927166595, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.99788806086652, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.5844545973083983, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

After:
```
wikitext: {'word_perplexity,none': 14464.299192404438, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.997861173678705, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.584448130015399, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

Reviewed By: dbort

Differential Revision: D64145852

Pulled By: dvorjackz

facebook-github-bot · 2024-10-14T23:31:58Z

This pull request was exported from Phabricator. Differential Revision: D64145852

facebook-github-bot · 2024-10-15T00:53:10Z

This pull request was exported from Phabricator. Differential Revision: D64145852

facebook-github-bot · 2024-10-15T04:21:21Z

@dvorjackz merged this pull request in 4745070.

Summary: Specify model to export in the CLI.

Test Plan: Exported the stories 110M model.
```
python -m examples.models.llama.export_llama -c stories110M/stories110M.pt -p stories110M/params.json -X -kv
```

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- **YOU ARE HERE ~>** [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Runner changes for TorchTune Llama3.2 vision text decoder](#6610)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Reviewed By: helunwencser

Differential Revision: D65612837

Pulled By: dvorjackz

facebook-github-bot added the CLA Signed label Oct 3, 2024

jackzhxng force-pushed the jz/eager-model-inputs branch from f205927 to 0902dea October 4, 2024 20:35

jackzhxng force-pushed the jz/cleanup branch from 2ae1c80 to 04aa7e8 October 4, 2024 20:36

jackzhxng force-pushed the jz/eager-model-inputs branch from 0902dea to 6cd759d October 4, 2024 20:46

jackzhxng force-pushed the jz/cleanup branch from 04aa7e8 to b78c082 October 4, 2024 20:47

jackzhxng force-pushed the jz/eager-model-inputs branch from 6cd759d to a6b8704 October 7, 2024 20:49

jackzhxng force-pushed the jz/cleanup branch from b78c082 to b48f917 October 7, 2024 20:55

This was referenced Oct 7, 2024

- Add kwarg example inputs to eager model base #5765 (Closed)
- Add et version of TorchTune MHA for swapping with custom op #5912 (Closed)
- Export TorchTune llama3_2_vision in ET #5911 (Merged)
- Accept model type parameter in export_llama #5910 (Closed)

jackzhxng force-pushed the jz/cleanup branch from b48f917 to 64c4fe1 October 7, 2024 22:51

jackzhxng force-pushed the jz/eager-model-inputs branch from 6a285ea to 9be5f57 October 8, 2024 06:41

jackzhxng force-pushed the jz/cleanup branch from 64c4fe1 to b116097 October 8, 2024 06:42

jackzhxng marked this pull request as ready for review October 8, 2024 07:13

facebook-github-bot force-pushed the jz/eager-model-inputs branch from 9be5f57 to 63e3b9e October 8, 2024 17:15

jackzhxng force-pushed the jz/cleanup branch from eebaef4 to af5dfd6 October 8, 2024 17:20

jackzhxng force-pushed the jz/eager-model-inputs branch from 63e3b9e to 6ff6615 October 8, 2024 20:09

jackzhxng force-pushed the jz/cleanup branch from af5dfd6 to d4b9d39 October 8, 2024 20:09

facebook-github-bot force-pushed the jz/eager-model-inputs branch from 6ff6615 to 126eebf October 8, 2024 23:03

facebook-github-bot force-pushed the jz/eager-model-inputs branch from 126eebf to 385e821 October 9, 2024 04:08

facebook-github-bot force-pushed the jz/eager-model-inputs branch from 385e821 to 6f792eb October 9, 2024 16:37

facebook-github-bot force-pushed the jz/eager-model-inputs branch from 6f792eb to d7038e4 October 9, 2024 17:43

jackzhxng force-pushed the jz/cleanup branch from d4b9d39 to 5161153 October 9, 2024 23:09

jackzhxng changed the base branch from jz/eager-model-inputs to main October 9, 2024 23:10

cccclai added the ciflow/trunk label Oct 10, 2024

facebook-github-bot force-pushed the jz/cleanup branch from 5161153 to 27a0116 October 11, 2024 01:58

facebook-github-bot added the fb-exported label Oct 11, 2024

dbort approved these changes Oct 14, 2024

facebook-github-bot force-pushed the jz/cleanup branch from 27a0116 to ad17e72 October 14, 2024 23:31

facebook-github-bot force-pushed the jz/cleanup branch from ad17e72 to fb1312f October 15, 2024 00:52

malfet approved these changes Oct 15, 2024

facebook-github-bot closed this in 4745070 Oct 15, 2024

facebook-github-bot added the Merged label Oct 15, 2024

This was referenced Oct 25, 2024

- Accept model type parameter in export_llama #6507 (Merged)
- Runner changes for TorchTune Llama3.2 vision text decoder #6610 (Merged)
