Llama2 model cleanup #5859

Closed
wants to merge 1 commit into from
Conversation

jackzhxng (Contributor) commented Oct 3, 2024

Summary

  • Removes redundant steps in the Llama2 export
  • Factors out checkpointing to be shared with future Llama models (namely 3.2 multimodal)
  • Comments and orders code more clearly
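As a rough illustration of what a shared checkpointing step could look like, here is a minimal sketch in plain Python. The helper name `normalize_checkpoint`, the nested `"model"` key, and the `"model."` prefix are all assumptions made for this example; they are not taken from the PR's diff.

```python
from typing import Any, Dict, Mapping


def normalize_checkpoint(
    checkpoint: Mapping[str, Any], prefix: str = "model."
) -> Dict[str, Any]:
    """Hypothetical shared helper: unwrap and rename checkpoint entries."""
    # Assumption: some checkpoints nest the weights under a top-level "model" key.
    state = checkpoint["model"] if "model" in checkpoint else checkpoint
    # Strip an optional key prefix so names line up with the eager model's names.
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state.items()
    }


# Toy checkpoint with plain lists standing in for tensors.
raw = {"model": {"model.tok_embeddings.weight": [0.0], "norm.weight": [1.0]}}
normalized = normalize_checkpoint(raw)
print(sorted(normalized))
```

Keeping this step in one function is what would let a second model family reuse it without copying the unwrapping logic.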

PR chain:

Test plan

Ensure that export and eval results are similar before and after the change. Before:

wikitext: {'word_perplexity,none': 14464.645927166595, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.99788806086652, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.5844545973083983, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}

After:

wikitext: {'word_perplexity,none': 14464.299192404438, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.997861173678705, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.584448130015399, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
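One way to make "similar" concrete is to compare the two result dicts metric by metric. The sketch below copies the numeric values reported above and computes the relative change for each; the helper name `relative_changes` and the `1e-4` tolerance are illustrative choices, not part of the test plan.

```python
# Numeric wikitext metrics copied from the Before/After results above.
before = {
    "word_perplexity,none": 14464.645927166595,
    "byte_perplexity,none": 5.99788806086652,
    "bits_per_byte,none": 2.5844545973083983,
}
after = {
    "word_perplexity,none": 14464.299192404438,
    "byte_perplexity,none": 5.997861173678705,
    "bits_per_byte,none": 2.584448130015399,
}


def relative_changes(old, new):
    """Return |new - old| / |old| for every metric present in both dicts."""
    return {k: abs(new[k] - old[k]) / abs(old[k]) for k in old if k in new}


# Hypothetical tolerance; choose whatever threshold "similar" should mean.
TOLERANCE = 1e-4

changes = relative_changes(before, after)
for name, change in changes.items():
    print(f"{name}: relative change {change:.2e}")
print("within tolerance:", all(c < TOLERANCE for c in changes.values()))
```

With these numbers, all three metrics move by less than `1e-4` in relative terms, with word perplexity showing the largest change.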


pytorch-bot bot commented Oct 3, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5859

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fb1312f with merge base 3a7056e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Oct 3, 2024
@jackzhxng jackzhxng force-pushed the jz/eager-model-inputs branch from f205927 to 0902dea Compare October 4, 2024 20:35
@jackzhxng jackzhxng force-pushed the jz/eager-model-inputs branch from 0902dea to 6cd759d Compare October 4, 2024 20:46
@jackzhxng jackzhxng force-pushed the jz/eager-model-inputs branch from 6cd759d to a6b8704 Compare October 7, 2024 20:49
@jackzhxng jackzhxng force-pushed the jz/eager-model-inputs branch from 6a285ea to 9be5f57 Compare October 8, 2024 06:41
@jackzhxng jackzhxng marked this pull request as ready for review October 8, 2024 07:13
facebook-github-bot pushed a commit that referenced this pull request Oct 8, 2024
Summary:
For situations where the model's forward has non-positional (keyword) arguments, such as https://github.com/pytorch/torchtune/blob/3c450ef5f1fbe8237f899e942fd5222491a47ca7/torchtune/modules/transformer.py#L519
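To illustrate the situation this summary describes, the sketch below uses a plain-Python stand-in for a model whose forward takes keyword-only arguments. The class names and the `get_example_kwarg_inputs` method are hypothetical and chosen for this example; the real base class and torchtune module may differ.

```python
from typing import Any, Dict, Optional, Tuple


class ToyDecoder:
    """Stand-in for a module whose forward has keyword-only arguments."""

    def forward(
        self,
        tokens: list,
        *,
        mask: Optional[list] = None,
        input_pos: Optional[list] = None,
    ) -> Dict[str, Any]:
        # Echo the inputs so the call path is easy to inspect.
        return {"tokens": tokens, "mask": mask, "input_pos": input_pos}


class ToyEagerModel:
    """Hypothetical wrapper that exposes both kinds of example inputs."""

    def get_eager_model(self) -> ToyDecoder:
        return ToyDecoder()

    def get_example_inputs(self) -> Tuple[Any, ...]:
        # Example inputs that are passed by position.
        return ([1, 2, 3],)

    def get_example_kwarg_inputs(self) -> Dict[str, Any]:
        # Example inputs for arguments that cannot be passed by position.
        return {"mask": None, "input_pos": [0, 1, 2]}


wrapper = ToyEagerModel()
model = wrapper.get_eager_model()
out = model.forward(
    *wrapper.get_example_inputs(), **wrapper.get_example_kwarg_inputs()
)
print(out)
```

Passing `input_pos` positionally here would raise a `TypeError`, which is why a separate set of keyword example inputs is useful.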

PR chain:
- **YOU ARE HERE ~>** [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)


Test Plan:
Exported Stories110M model.
```
wget "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt"
echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json
python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -X -kv
```
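Before running the export, the parameters written to `params.json` can be sanity-checked. The sketch below parses the same JSON string and derives the per-head dimension, assuming `dim` is split evenly across `n_heads`; that is a common transformer convention rather than something stated in this test plan.

```python
import json

# Same JSON string that the echo command above writes to params.json.
PARAMS_JSON = (
    '{"dim": 768, "multiple_of": 32, "n_heads": 12, '
    '"n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}'
)

params = json.loads(PARAMS_JSON)

# Assumption: the model dimension is divided evenly among attention heads.
assert params["dim"] % params["n_heads"] == 0, "dim must be divisible by n_heads"
head_dim = params["dim"] // params["n_heads"]
print("head_dim:", head_dim)  # 768 // 12 = 64
```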

Differential Revision: D64027696

Pulled By: dvorjackz
@jackzhxng jackzhxng force-pushed the jz/eager-model-inputs branch from 63e3b9e to 6ff6615 Compare October 8, 2024 20:09
facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2024
Pull Request resolved: #5765

Reviewed By: tarun292

Differential Revision: D64027696

Pulled By: dvorjackz

fbshipit-source-id: 15ecfb458c6194159140d4c601e5443a2e524fdc
@jackzhxng jackzhxng changed the base branch from jz/eager-model-inputs to main October 9, 2024 23:10
@facebook-github-bot
Contributor

@dvorjackz has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Oct 11, 2024
Summary:
- Removes redundant steps in the Llama2 export
- Factors out checkpointing to be shared with future Llama models (namely 3.2 multimodal)
- Comments and orders code more clearly

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- **YOU ARE HERE ~>** [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)


Test Plan:
Ensure export + eval is similar before and after for Stories 110M:
```
python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model/bin> -d fp32 --max_seq_len 2048 --limit 1000
```


Before:
```
wikitext: {'word_perplexity,none': 14464.645927166595, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.99788806086652, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.5844545973083983, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

After:
```
wikitext: {'word_perplexity,none': 14464.299192404438, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.997861173678705, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.584448130015399, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```
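The two byte-level metrics in these results appear to carry the same information: taking the base-2 logarithm of the reported `byte_perplexity` reproduces the reported `bits_per_byte` closely. The check below is a sanity check on the numbers above under that assumption, not a description of how the eval harness computes them.

```python
import math

# Byte-level metrics copied from the Before/After results above.
results = {
    "before": {
        "byte_perplexity": 5.99788806086652,
        "bits_per_byte": 2.5844545973083983,
    },
    "after": {
        "byte_perplexity": 5.997861173678705,
        "bits_per_byte": 2.584448130015399,
    },
}

matches = {}
for label, metrics in results.items():
    # Assumed relation: bits_per_byte = log2(byte_perplexity).
    derived = math.log2(metrics["byte_perplexity"])
    matches[label] = math.isclose(derived, metrics["bits_per_byte"], rel_tol=1e-6)
    print(label, derived, metrics["bits_per_byte"], matches[label])
```

If the relation holds, comparing either metric before and after is sufficient; the other follows from it.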

Differential Revision: D64145852

Pulled By: dvorjackz
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D64145852

facebook-github-bot pushed a commit that referenced this pull request Oct 14, 2024
Reviewed By: dbort

Differential Revision: D64145852

Pulled By: dvorjackz

@facebook-github-bot
Contributor

@dvorjackz merged this pull request in 4745070.

facebook-github-bot pushed a commit that referenced this pull request Nov 11, 2024
Summary:
Specify model to export in the CLI.


Test Plan:
Exported the stories 110M model.
```
python -m examples.models.llama.export_llama -c stories110M/stories110M.pt -p stories110M/params.json -X -kv
```
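A model selector on the command line could be sketched with `argparse` as follows. The short flags mirror the command above, but the long option names, the `--model` flag, and its choices are assumptions made for this illustration and may not match the actual `export_llama` parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="export_llama_sketch")
    # Short flags mirror the command above; the long names are assumptions.
    parser.add_argument("-c", "--checkpoint", required=True)
    parser.add_argument("-p", "--params", required=True)
    parser.add_argument("-X", "--xnnpack", action="store_true")
    parser.add_argument("-kv", "--use_kv_cache", action="store_true")
    # Hypothetical selector for which model definition to export.
    parser.add_argument(
        "--model",
        default="llama2",
        choices=["llama2", "llama3_2_vision"],
    )
    return parser


args = build_parser().parse_args(
    ["-c", "stories110M/stories110M.pt", "-p", "stories110M/params.json", "-X", "-kv"]
)
print(args.model, args.checkpoint, args.xnnpack, args.use_kv_cache)
```

Giving the selector a default keeps existing invocations, such as the one in this test plan, working unchanged.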

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- **YOU ARE HERE ~>** [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Runner changes for TorchTune Llama3.2 vision text decoder](#6610)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Differential Revision: D65612837

Pulled By: dvorjackz
facebook-github-bot pushed a commit that referenced this pull request Nov 12, 2024
Reviewed By: helunwencser

Differential Revision: D65612837

Pulled By: dvorjackz
Labels
ciflow/trunk, CLA Signed, fb-exported, Merged
5 participants