-
Notifications
You must be signed in to change notification settings - Fork 1.2k
add tfs container support #460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #460 +/- ##
==========================================
+ Coverage 93.77% 93.86% +0.08%
==========================================
Files 55 56 +1
Lines 4114 4190 +76
==========================================
+ Hits 3858 3933 +75
- Misses 256 257 +1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider including Readme updates as well.
src/sagemaker/predictor.py
Outdated
"""Return the inference from the specified endpoint. | ||
|
||
Args: | ||
data (object): Input data for which you want the model to provide inference. | ||
If a serializer was specified when creating the RealTimePredictor, the result of the | ||
serializer is sent as input data. Otherwise the data must be sequence of bytes, and | ||
the predict method then sends the bytes in the request body as is. | ||
initial_args (dict[str,str]): Optional. Initial request arguments. Default is None. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please clarify more this docstring argument.
src/sagemaker/tensorflow/__init__.py
Outdated
@@ -29,3 +29,4 @@ | |||
|
|||
from sagemaker.tensorflow.estimator import TensorFlow # noqa: E402, F401 | |||
from sagemaker.tensorflow.model import TensorFlowModel, TensorFlowPredictor # noqa: E402, F401 | |||
from sagemaker.tensorflow.tfs import TFSModel, TFSPredictor # noqa: E402, F401 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid acronyms to keep the same pattern of the SDK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'TensorFlowServingModel' is unwieldy, and ambiguous since TensorFlowModel also refers to a kind of TensorFlow Serving model. TFS is well aligned with the module name, and the container side package name.
Another option... just calling them Model and Predictor (sagemaker.tensorflow.tfs.Model etc), but this invites collision with sagemaker.Model etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about sagemaker.tesorflow.serving.Model or sagemaker.tensorflow_serving.Model?
Collision should not happen if we are importing modules not classes like specified in the style guide.
CHANGELOG.rst
Outdated
1.14.0-dev | ||
========== | ||
|
||
* add support for sagemaker-tfs container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please follow correct changelog format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
name=self._current_job_name, | ||
framework_version=self.framework_version, | ||
sagemaker_session=self.sagemaker_session, | ||
vpc_config=self.get_vpc_config(vpc_config_override)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you missing container_log_level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
src/sagemaker/tensorflow/tfs.py
Outdated
from sagemaker import Model, RealTimePredictor | ||
from sagemaker.content_types import CONTENT_TYPE_JSON | ||
from sagemaker.fw_utils import create_image_uri | ||
from sagemaker.predictor import json_serializer, json_deserializer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: from google python style guide -> import modules not functions and classes.
src/sagemaker/tensorflow/tfs.py
Outdated
attributes.append('tfs-model-name={}'.format(model_name)) | ||
if model_version: | ||
attributes.append('tfs-model-version={}'.format(model_version)) | ||
self._model_attributes = ','.join(attributes) if attributes else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use an empty array to represent an empty list of model attributes instead of None?
src/sagemaker/tensorflow/tfs.py
Outdated
|
||
|
||
class TFSModel(Model): | ||
FRAMEWORK_NAME = 'tfs' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do believe that other classes use dunders instead of a constant, although I Iike your style better and matches the conventions https://github.com/google/styleguide/blob/gh-pages/pyguide.md#3164-guidelines-derived-from-guidos-recommendations
nit: consider leaving it at module level
src/sagemaker/tensorflow/tfs.py
Outdated
logging.WARNING: 'warn', | ||
logging.ERROR: 'error', | ||
logging.CRITICAL: 'crit', | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
src/sagemaker/tensorflow/tfs.py
Outdated
logging.WARNING: 'warn', | ||
logging.ERROR: 'error', | ||
logging.CRITICAL: 'crit', | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dictionaries are not constants in Python.
src/sagemaker/tensorflow/tfs.py
Outdated
return self.env | ||
|
||
env = dict(self.env) | ||
env['SAGEMAKER_TFS_NGINX_LOGLEVEL'] = TFSModel.LOG_LEVEL_MAP[self._container_log_level] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use a constant for this key.
src/sagemaker/tensorflow/tfs.py
Outdated
# reuse standard image uri function, then strip unwanted python component | ||
region_name = self.sagemaker_session.boto_region_name | ||
image = create_image_uri(region_name, TFSModel.FRAMEWORK_NAME, instance_type, | ||
self._framework_version, 'py3') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard is to change create image uri instead of reusing and striping it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lots of fussy behavior in create_image_uri. it would be easy to fall out of sync
with timeout_and_delete_endpoint_by_name(endpoint_name, sagemaker_session): | ||
model = TFSModel(model_data, 'SageMakerRole', framework_version=tf_full_version) | ||
predictor = model.deploy(1, 'ml.c5.xlarge', endpoint_name=endpoint_name) | ||
yield predictor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice test setup
serializer=sagemaker.predictor.csv_serializer) | ||
|
||
result = predictor.predict(input_data) | ||
assert expected_result == result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are missing test cases for errors and empty requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- handled in unit tests and in container package tests
- also this is generic predictor / serializer behavior that is well covered by other tests
Description of changes:
add support for sagemaker-tfs inference container
Merge Checklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.