
Accept model type parameter in export_llama #5910


Closed — jackzhxng wants to merge 11 commits into main from jz/tt-llama

Conversation

@jackzhxng (Contributor) commented Oct 5, 2024

pytorch-bot bot commented Oct 5, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5910

Note: Links to docs will display an error until the docs builds have been completed.

❌ 14 New Failures, 2 Cancelled Jobs

As of commit 96ba40b with merge base 8f9fb7e:

NEW FAILURES - 14 jobs failed.

CANCELLED JOBS - 2 jobs were cancelled. Please retry.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Oct 5, 2024
@jackzhxng force-pushed the jz/cleanup branch 2 times, most recently from 64c4fe1 to b116097, on October 8, 2024 06:42
@jackzhxng force-pushed the jz/tt-llama branch 2 times, most recently from 0c46b2e to aa30bca, on October 8, 2024 07:21
facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2024
Summary:
For situations where the forward has non-positional (keyword) arguments, such as https://github.com/pytorch/torchtune/blob/3c450ef5f1fbe8237f899e942fd5222491a47ca7/torchtune/modules/transformer.py#L519

PR chain:
- **YOU ARE HERE ~>** [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Pull Request resolved: #5765

Test Plan:
Exported Stories110M model.
```
wget "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt"
echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json
python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -X -kv
```

Reviewed By: tarun292

Differential Revision: D64027696

Pulled By: dvorjackz

fbshipit-source-id: 15ecfb458c6194159140d4c601e5443a2e524fdc
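
As a point of reference, here is a minimal sketch of the pattern this commit enables: passing keyword example inputs through to `torch.export` for a `forward()` that takes keyword-only arguments. The toy module and names below are illustrative assumptions, not the PR's actual code.

```python
import torch
from torch.export import export


class ToyDecoder(torch.nn.Module):
    """Toy stand-in for a model whose forward takes keyword-only args."""

    def forward(self, tokens: torch.Tensor, *, input_pos: torch.Tensor):
        # `input_pos` is keyword-only, so positional example inputs
        # alone cannot describe a call to this forward().
        return tokens + input_pos


model = ToyDecoder()
example_args = (torch.zeros(8, dtype=torch.long),)
example_kwargs = {"input_pos": torch.arange(8)}  # kwarg example inputs

# torch.export accepts example kwargs alongside positional args.
exported = export(model, example_args, kwargs=example_kwargs)
print(exported)
```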
@facebook-github-bot force-pushed the jz/cleanup branch 2 times, most recently from ad17e72 to fb1312f, on October 15, 2024 00:52
facebook-github-bot pushed a commit that referenced this pull request Oct 15, 2024
Summary:
- Removes redundant steps in the Llama2 export
- Factors out checkpointing to be shared with future Llama models (namely 3.2 multimodal)
- Comments and organizes the code more clearly

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- **YOU ARE HERE ~>** [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Pull Request resolved: #5859

Test Plan:
Ensure export + eval is similar before and after for Stories 110M:
```
python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model/bin> -d fp32 --max_seq_len 2048 --limit 1000
```

Before:
```
wikitext: {'word_perplexity,none': 14464.645927166595, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.99788806086652, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.5844545973083983, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

After:
```
wikitext: {'word_perplexity,none': 14464.299192404438, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.997861173678705, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.584448130015399, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

Reviewed By: malfet, dbort

Differential Revision: D64145852

Pulled By: dvorjackz

fbshipit-source-id: daeee834955e154e7c8262ce776bd3039991027d
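
As a rough illustration of the checkpoint factoring described in the summary above, a shared loader might look like the following. The helper name and file layout are hypothetical; the actual code under `examples/models/llama` may differ.

```python
import json

import torch


def load_llama_checkpoint(checkpoint_path: str, params_path: str):
    """Hypothetical shared loader for Llama-family checkpoints."""
    with open(params_path) as f:
        params = json.load(f)  # e.g. {"dim": 768, "n_heads": 12, ...}
    # mmap avoids materializing the whole file in memory up front, and
    # weights_only avoids arbitrary pickle execution on untrusted files.
    state_dict = torch.load(
        checkpoint_path, map_location="cpu", mmap=True, weights_only=True
    )
    return state_dict, params


# Usage with the Stories110M artifacts from the test plans above:
# state_dict, params = load_llama_checkpoint("stories110M.pt", "params.json")
```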
@jackzhxng changed the base branch from jz/cleanup to main on October 15, 2024 21:04
@facebook-github-bot (Contributor)

@dvorjackz has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jackzhxng closed this Oct 25, 2024
facebook-github-bot pushed a commit that referenced this pull request Nov 13, 2024
Summary:
Specify model to export in the CLI.


Test Plan:
Exported the stories 110M model.
```
python -m examples.models.llama.export_llama -c stories110M/stories110M.pt -p stories110M/params.json -X -kv
```

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- **YOU ARE HERE ~>** [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Runner changes for TorchTune Llama3.2 vision text decoder](#6610)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Reviewed By: helunwencser

Differential Revision: D65612837

Pulled By: dvorjackz
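
To make the model-type parameter concrete, here is a sketch of what such a CLI option can look like. The flag name, choices, and registry below are assumptions for illustration; consult the actual `export_llama` argument parser for the real interface.

```python
import argparse

# Hypothetical registry mapping a model-type string to the module that
# provides its eager model definition.
MODEL_REGISTRY = {
    "llama2": "examples.models.llama",
    "llama3_2_vision": "examples.models.llama3_2_vision",
}

parser = argparse.ArgumentParser(prog="export_llama")
parser.add_argument(
    "--model",
    choices=sorted(MODEL_REGISTRY),
    default="llama2",
    help="Model family to export; selects the eager model implementation.",
)
parser.add_argument("-c", "--checkpoint", required=True)
parser.add_argument("-p", "--params", required=True)

args = parser.parse_args(
    ["--model", "llama2", "-c", "stories110M.pt", "-p", "params.json"]
)
print(MODEL_REGISTRY[args.model])  # -> examples.models.llama
```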
Labels: CLA Signed
Projects: None yet
2 participants