-
Notifications
You must be signed in to change notification settings - Fork 1.2k
fix: adjacent slash in s3 key #3427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
self.sagemaker_session.context.runtime_repack_output_prefix = "s3://{}/{}".format( | ||
bucket, key_prefix | ||
self.sagemaker_session.context.runtime_repack_output_prefix = s3.s3_path_join( | ||
"s3://", bucket, key_prefix | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems the s3.s3_path_join
will remove the "/" in the end of the path and return "s3://my-bucket/my-key-prefix".
But given the customer input in #3413
Removing the slash from the code location “works” (as in does not create an error message), but then I have no separation between the pipeline prefix and the artifact name, e.g. pipeline-codepytorchinference ...
Should we also check the estimator side on how it generates the entire code output path with the code_location
?
Or if this is already checked in the unit test that a separation is correctly added between the prefix and the artifact name, please let me know. Then we're good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plus, if we decide to update the model side rather than the estimator side, as this only updates the tensorflow model, same changes are needed in the model class as well:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if this is already checked in the unit test that a separation is correctly added between the prefix and the artifact name, please let me know. Then we're good
yep, I verified in the unit test that for input code_location f's3://my-bucket/pipeline-code/'
and model-name pytorchinference
the s3 path will be s3://bucket/pipeline-code/pytorchinference/...
. Note that key_prefix
in the code above is actually pipeline-code//pytorchinference
. Basically this code change will convert the //
to /
same changes are needed in the model class as well
Yes, forgot to commit that class adding it now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a second thought, how about we directly update the joins in _stage_user_code_in_s3
in estimator.py (see here)? The reasons are:
- even if we update the model side to remove the "/" at the end of
code_location
, if user supplies...code_location/
directly to the code_location to an estimator, thecode_location//some-key
would still persist in estimator code output. Though it may not cause any errors but it may not work as customer's expectation. - If update the estimator side, we can just update in one place and no need to update the model sides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am assuming that by "artifact" the user means "model-name"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about we directly update the joins in _stage_user_code_in_s3 in estimator.py
Synced up offline, this may break backward compatibility. Given that we don't see any customer complains on the estimator's code location output path, we can just simply update the model side
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Codecov Report
@@ Coverage Diff @@
## master #3427 +/- ##
==========================================
- Coverage 89.17% 88.95% -0.23%
==========================================
Files 204 205 +1
Lines 18979 19070 +91
==========================================
+ Hits 16924 16963 +39
- Misses 2055 2107 +52
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/bot run slow-tests, notebook-tests
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/bot run slow-tests
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Co-authored-by: Namrata Madan <[email protected]>
Co-authored-by: Namrata Madan <[email protected]>
Co-authored-by: Namrata Madan <[email protected]>
Issue #, if available: #3413
Description of changes: Change S3 output_path to remove adjacent slashes
Testing done: unit testing
Merge Checklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.General
Tests
unique_name_from_base
to create resource names in integ tests (if appropriate)By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.