Introduce GenerationConfig #10228
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10228
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit ef7d4ca with merge base f911567.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D73091676
if (warmup) {
-  runner.warmup(prompt, seq_len);
+  runner.warmup(prompt, /*max_new_tokens=*/seq_len);
Would it be added in the internal runner as well?
which internal runner?
Summary:
Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig`, which holds the settings that can change between invocations of `generate()`. For example, `temperature` is moved out of the runner constructor because it is not tied to the runner instance and should be adjustable on every call to `generate()`. Similarly, `echo` and `warming` are moved into the config.

Both `seq_len` and `max_new_tokens` can be passed through the config. The effective value of `max_new_tokens` is determined from these two config values, the pte file metadata, and the number of prompt tokens.

Reviewed By: iseeyuan
Differential Revision: D73091676
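
The resolution of `max_new_tokens` described in the summary can be illustrated with a small sketch. This is not the code from the PR: the field names come from the summary, while the default values, the `-1` "unset" sentinel, the free function `resolve_max_new_tokens`, and the parameter `max_context_len` (standing in for the value read from the pte file metadata) are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical per-call generation settings. Field names follow the PR
// summary; the defaults and the -1 "unset" sentinel are assumptions.
struct GenerationConfig {
  bool echo = true;             // assumed meaning: include the prompt in the output
  bool warming = false;         // assumed meaning: this call is a warmup run
  float temperature = 0.8f;     // sampling temperature (assumed default)
  int32_t seq_len = -1;         // cap on prompt + generated tokens; -1 = unset
  int32_t max_new_tokens = -1;  // cap on generated tokens only; -1 = unset
};

// Combine the two config values, the model's context length (assumed to come
// from the pte file metadata), and the prompt length into one token budget.
inline int32_t resolve_max_new_tokens(
    const GenerationConfig& config,
    int32_t max_context_len,
    int32_t num_prompt_tokens) {
  assert(max_context_len >= 0 && num_prompt_tokens >= 0);

  // Total budget: the context window, tightened by seq_len when it is set.
  int32_t total_budget = max_context_len;
  if (config.seq_len != -1) {
    total_budget = std::min(total_budget, config.seq_len);
  }

  // Whatever the prompt does not consume is available for new tokens.
  int32_t remaining = total_budget - num_prompt_tokens;

  // An explicit max_new_tokens can only shrink that budget further.
  if (config.max_new_tokens != -1) {
    remaining = std::min(remaining, config.max_new_tokens);
  }

  // Never report a negative budget when the prompt already exceeds the cap.
  return std::max<int32_t>(0, remaining);
}
```

Under these assumptions, a caller would build a fresh `GenerationConfig` for each `generate()` call, set only the fields it cares about, and let the runner clamp the request to what the model can actually hold.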
Differential Revision: D73091676 Pull Request resolved: pytorch#10228