fix: jumpstart async inference config predictor support #3970

Conversation


@evakravi evakravi commented Jun 29, 2023

Issue #, if available:
#3969

Description of changes:
If the async_inference_config field is used with JumpStartModel or JumpStartEstimator, we now use the predictor associated with async inference instead of trying to create a predictor with JumpStart-defaulted values.

Testing done:
Wrote unit tests.

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the CONTRIBUTING doc
  • I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
  • I used the commit message format described in CONTRIBUTING
  • I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
  • I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
  • I have checked that my tests are not configured for a specific region or account (if appropriate)
  • I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@evakravi evakravi requested a review from a team as a code owner June 29, 2023 20:18
@evakravi evakravi requested review from mufaddal-rohawala and removed request for a team June 29, 2023 20:18
@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 7ed3e2d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 7ed3e2d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 7ed3e2d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


@dgallitelli

Refers to #3969

@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 7ed3e2d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


@codecov-commenter

codecov-commenter commented Jun 29, 2023

Codecov Report

Merging #3970 (7ed3e2d) into master (a68faf1) will decrease coverage by 0.73%.
The diff coverage is 100.00%.

❗ Current head 7ed3e2d differs from pull request most recent head 1196c4d. Consider uploading reports for the commit 1196c4d to get more accurate results

@@            Coverage Diff             @@
##           master    #3970      +/-   ##
==========================================
- Coverage   90.32%   89.59%   -0.73%     
==========================================
  Files        1292      305     -987     
  Lines      113972    28188   -85784     
==========================================
- Hits       102940    25256   -77684     
+ Misses      11032     2932    -8100     
Impacted Files                          Coverage Δ
src/sagemaker/jumpstart/estimator.py    98.36% <100.00%> (ø)
src/sagemaker/jumpstart/model.py        97.82% <100.00%> (ø)

... and 1595 files with indirect coverage changes

@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 7ed3e2d
  • Result: FAILED
  • Build Logs (available for 30 days)


@akrishna1995 akrishna1995 self-requested a review June 29, 2023 22:42
@@ -973,7 +973,7 @@ def deploy(
)

@akrishna1995 akrishna1995 Jun 29, 2023

Please remove the link in the description to our internal corp ticketing system (link this bug instead: #3969).

@@ -432,7 +432,7 @@ def deploy(
         predictor = super(JumpStartModel, self).deploy(**deploy_kwargs.to_kwargs_dict())

         # If no predictor class was passed, add defaults to predictor
-        if self.orig_predictor_cls is None:
+        if self.orig_predictor_cls is None and async_inference_config is None:
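The effect of this one-line guard can be sketched with plain stand-in classes (hypothetical names, not the actual SDK code): when an async inference config is supplied, deploy() already returns an async-aware predictor, so JumpStart must not overwrite it with defaulted values.

```python
# Hypothetical sketch of the guard above; Predictor, AsyncPredictor, and
# finalize_predictor are illustrative stand-ins, not the real SDK classes.

class Predictor:
    """Stand-in for a plain real-time predictor."""
    jumpstart_defaults_applied = False

class AsyncPredictor(Predictor):
    """Stand-in for the predictor returned for async endpoints."""

def finalize_predictor(predictor, orig_predictor_cls=None, async_inference_config=None):
    # Mirror of the fixed condition: attach JumpStart-defaulted attributes
    # only when the caller passed neither a custom predictor class nor an
    # async inference config.
    if orig_predictor_cls is None and async_inference_config is None:
        predictor.jumpstart_defaults_applied = True
    return predictor

sync_pred = finalize_predictor(Predictor())
async_pred = finalize_predictor(
    AsyncPredictor(), async_inference_config={"output_path": "s3://bucket/out"}
)
print(sync_pred.jumpstart_defaults_applied)   # True
print(async_pred.jumpstart_defaults_applied)  # False
```

Before the fix, the condition checked only orig_predictor_cls, so an async deployment's predictor was also rewritten with JumpStart defaults, which triggered the RuntimeError reported in #3969.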
Contributor


Don't have enough context here, but can you confirm that we will not see similar issues for other types of configs as well (serverless_inference_config, data_capture_config)? I.e., if that config is None, do we need to return a default predictor as well?

Member Author


To be honest, I am really not sure. I wanted to resolve the customer ticket ASAP, but I will do a deep dive into other kinds of issues with predictor being set under the hood.

Contributor


thanks for the callout @akrishna1995.

IIUC, data capture wouldn't impact the predictor, but please let us know if you disagree.

As of today, JumpStart does not claim to support serverless inference for any model. It might work for some of our smaller models, but it wouldn't for foundation models: they all run on GPU. We will look into it for our older models, but can this be done in a separate PR?

Contributor


Sure, thanks for the confirmation. The rest LGTM; I will approve and merge the PR.

Contributor

@akrishna1995 akrishna1995 left a comment


Changes LGTM overall; one minor comment for confirmation, and a request to remove the Amazon-specific ticketing link from the issue description.

@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 1196c4d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 1196c4d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 1196c4d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 1196c4d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


@sagemaker-bot

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 1196c4d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


Contributor

@akrishna1995 akrishna1995 left a comment


LGTM

@akrishna1995 akrishna1995 merged commit 45cdd70 into aws:master Jun 30, 2023
Development

Successfully merging this pull request may close these issues.

SageMaker JumpStart deployment to Asynchronous Endpoint generates a RuntimeError
6 participants