fix: jumpstart async inference config predictor support #3970
@@ -432,7 +432,7 @@ def deploy(
         predictor = super(JumpStartModel, self).deploy(**deploy_kwargs.to_kwargs_dict())

         # If no predictor class was passed, add defaults to predictor
-        if self.orig_predictor_cls is None:
+        if self.orig_predictor_cls is None and async_inference_config is None:
             return get_default_predictor(
                 predictor=predictor,
                 model_id=self.model_id,

Review discussion on this change:

- Reviewer: Don't have enough context here, but can you confirm that we will not see similar issues for other types of configs as well, such as serverless_inference_config and data_capture_config? That is, if one of those is None, do we need to return a default predictor as well?
- Author: To be honest, I am really not sure. I wanted to resolve the customer ticket as soon as possible, but I will do a deep dive into other issues with the predictor being set under the hood.
- Maintainer: Thanks for the callout @akrishna1995. As far as we understand, data capture wouldn't impact the predictor, but please let us know if you disagree. As of today, JumpStart does not claim to support serverless inference for any model. It might work for some of our smaller models, but for foundation models it wouldn't: they all run on GPU. We will look into it for our older models, but can that be done in separate PRs?
- Reviewer: Sure, thanks for the confirmation. The rest LGTM; will approve and merge the PR.
Review comment on the PR description: Please remove the link in the description to our internal corp ticket; link this bug instead: #3969.