Add warning if framework_version is not set #431

Merged: 4 commits, Oct 15, 2018
1 change: 1 addition & 0 deletions CHANGELOG.rst
@@ -9,6 +9,7 @@ CHANGELOG
* feature: Add timestamp to secondary status in training job output
* bug-fix: Local Mode: Set correct default values for additional_volumes and additional_env_vars
* enhancement: Local Mode: support nvidia-docker2 natively
* warning: Frameworks: add warning for upcoming breaking change that makes framework_version required

1.11.2
======
22 changes: 13 additions & 9 deletions README.rst
@@ -153,9 +153,10 @@ Here is an end to end example of how to use a SageMaker Estimator:

# Configure an MXNet Estimator (no training happens yet)
mxnet_estimator = MXNet('train.py',
role="SageMakerRole",
role='SageMakerRole',
train_instance_type='ml.p2.xlarge',
train_instance_count = 1)
train_instance_count=1,
framework_version='1.2.1')

# Starts a SageMaker training job and waits until completion.
mxnet_estimator.fit('s3://my_bucket/my_training_data/')
@@ -183,9 +184,10 @@ We can take the example in `Using Estimators <#using-estimators>`__ , and use e

# Configure an MXNet Estimator (no training happens yet)
mxnet_estimator = MXNet('train.py',
role="SageMakerRole",
role='SageMakerRole',
train_instance_type='local',
train_instance_count=1)
train_instance_count=1,
framework_version='1.2.1')

# In Local Mode, fit will pull the MXNet container Docker image and run it locally
mxnet_estimator.fit('s3://my_bucket/my_training_data/')
@@ -239,7 +241,8 @@ Here is an end-to-end example:

mxnet_estimator = MXNet('train.py',
train_instance_type='local',
train_instance_count=1)
train_instance_count=1,
framework_version='1.2.1')

mxnet_estimator.fit('file:///tmp/my_training_data')
transformer = mxnet_estimator.transformer(1, 'local', assemble_with='Line', max_payload=1)
@@ -504,10 +507,11 @@ To train a model using your own VPC, set the optional parameters ``subnets`` and

# Configure an MXNet Estimator with subnets and security groups from your VPC
mxnet_vpc_estimator = MXNet('train.py',
train_instance_type='ml.p2.xlarge',
train_instance_count = 1,
subnets=['subnet-1', 'subnet-2'],
security_group_ids=['sg-1'])
train_instance_type='ml.p2.xlarge',
train_instance_count=1,
framework_version='1.2.1',
subnets=['subnet-1', 'subnet-2'],
security_group_ids=['sg-1'])

# SageMaker Training Job will set VpcConfig and container instances will run in your VPC
mxnet_vpc_estimator.fit('s3://my_bucket/my_training_data/')
33 changes: 18 additions & 15 deletions src/sagemaker/chainer/README.rst
@@ -28,11 +28,12 @@ Suppose that you already have an Chainer training script called
.. code:: python

from sagemaker.chainer import Chainer
chainer_estimator = Chainer(entry_point="chainer-train.py",
role="SageMakerRole",
train_instance_type="ml.p3.2xlarge",
train_instance_count=1)
chainer_estimator.fit("s3://bucket/path/to/training/data")
chainer_estimator = Chainer(entry_point='chainer-train.py',
role='SageMakerRole',
train_instance_type='ml.p3.2xlarge',
train_instance_count=1,
framework_version='4.1.0')
chainer_estimator.fit('s3://bucket/path/to/training/data')

Where the S3 URL is a path to your training data, within Amazon S3. The constructor keyword arguments define how
SageMaker runs your training script and are discussed in detail in a later section.
@@ -107,12 +108,13 @@ directories ('train' and 'test').

.. code:: python

chainer_estimator = Chainer("chainer-train.py",
train_instance_type="ml.p3.2xlarge",
train_instance_count=1,
hyperparameters = {'epochs': 20, 'batch-size': 64, 'learning-rate':0.1})
chainer_estimator = Chainer('chainer-train.py',
train_instance_type='ml.p3.2xlarge',
train_instance_count=1,
framework_version='4.1.0',
hyperparameters = {'epochs': 20, 'batch-size': 64, 'learning-rate': 0.1})
chainer_estimator.fit({'train': 's3://my-data-bucket/path/to/my/training/data',
'test': 's3://my-data-bucket/path/to/my/test/data'})
'test': 's3://my-data-bucket/path/to/my/test/data'})


Chainer Estimators
@@ -280,13 +282,14 @@ operation.
.. code:: python

# Train my estimator
chainer_estimator = Chainer(entry_point="train_and_deploy.py",
train_instance_type="ml.p3.2xlarge",
train_instance_count=1)
chainer_estimator.fit("s3://my_bucket/my_training_data/")
chainer_estimator = Chainer(entry_point='train_and_deploy.py',
train_instance_type='ml.p3.2xlarge',
train_instance_count=1,
framework_version='4.1.0')
chainer_estimator.fit('s3://my_bucket/my_training_data/')

# Deploy my estimator to a SageMaker Endpoint and get a Predictor
predictor = chainer_estimator.deploy(instance_type="ml.m4.xlarge",
predictor = chainer_estimator.deploy(instance_type='ml.m4.xlarge',
initial_instance_count=1)

# `data` is a NumPy array or a Python list.
14 changes: 11 additions & 3 deletions src/sagemaker/chainer/estimator.py
@@ -12,12 +12,17 @@
# language governing permissions and limitations under the License.
from __future__ import absolute_import

import logging

from sagemaker.estimator import Framework
from sagemaker.fw_utils import framework_name_from_image, framework_version_from_tag
from sagemaker.fw_utils import framework_name_from_image, framework_version_from_tag, empty_framework_version_warning
from sagemaker.chainer.defaults import CHAINER_VERSION
from sagemaker.chainer.model import ChainerModel
from sagemaker.vpc_utils import VPC_CONFIG_DEFAULT

logging.basicConfig()
logger = logging.getLogger('sagemaker')


class Chainer(Framework):
"""Handle end-to-end training and deployment of custom Chainer code."""
@@ -32,7 +37,7 @@ class Chainer(Framework):

def __init__(self, entry_point, use_mpi=None, num_processes=None, process_slots_per_host=None,
additional_mpi_options=None, source_dir=None, hyperparameters=None, py_version='py3',
framework_version=CHAINER_VERSION, image_name=None, **kwargs):
framework_version=None, image_name=None, **kwargs):
"""
This ``Estimator`` executes an Chainer script in a managed Chainer execution environment, within a SageMaker
Training Job. The managed Chainer environment is an Amazon-built Docker container that executes functions
@@ -79,12 +84,15 @@ def __init__(self, entry_point, use_mpi=None, num_processes=None, process_slots_
super(Chainer, self).__init__(entry_point, source_dir, hyperparameters,
image_name=image_name, **kwargs)
self.py_version = py_version
self.framework_version = framework_version
self.use_mpi = use_mpi
self.num_processes = num_processes
self.process_slots_per_host = process_slots_per_host
self.additional_mpi_options = additional_mpi_options

if framework_version is None:
logger.warning(empty_framework_version_warning(CHAINER_VERSION))
self.framework_version = framework_version or CHAINER_VERSION

def hyperparameters(self):
"""Return hyperparameters used by your custom Chainer code during training."""
hyperparameters = super(Chainer, self).hyperparameters()
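Taken out of the SDK, the warn-and-default pattern this PR applies to each framework estimator's constructor can be sketched on its own. The class name, default constant, and template string below are illustrative stand-ins, not the SDK's actual API:

```python
import logging

logging.basicConfig()
logger = logging.getLogger('sagemaker')

# Illustrative stand-ins -- the real SDK uses per-framework constants
# such as CHAINER_VERSION, MXNET_VERSION, and PYTORCH_VERSION.
DEFAULT_VERSION = '4.1.0'

WARNING_TEMPLATE = ('In an upcoming version of the SageMaker Python SDK, '
                    'framework_version will be required to create an estimator. '
                    'Please add framework_version={} to your constructor to avoid '
                    'an error in the future.')


class Estimator(object):
    """Minimal sketch of the warn-and-default constructor pattern."""

    def __init__(self, framework_version=None):
        # Warn only when the caller omitted the argument entirely.
        if framework_version is None:
            logger.warning(WARNING_TEMPLATE.format(DEFAULT_VERSION))
        self.framework_version = framework_version or DEFAULT_VERSION


assert Estimator().framework_version == '4.1.0'          # warns, falls back
assert Estimator('5.0.0').framework_version == '5.0.0'   # explicit, no warning
```

Passing the version explicitly keeps today's behavior and avoids the warning; omitting it logs the deprecation notice and falls back to the framework default.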
10 changes: 9 additions & 1 deletion src/sagemaker/fw_utils.py
@@ -23,14 +23,18 @@

"""This module contains utility functions shared across ``Framework`` components."""


UploadedCode = namedtuple('UserCode', ['s3_prefix', 'script_name'])
"""sagemaker.fw_utils.UserCode: An object containing the S3 prefix and script name.

This is for the source code used for the entry point with an ``Estimator``. It can be
instantiated with positional or keyword arguments.
"""

EMPTY_FRAMEWORK_VERSION_WARNING = 'In an upcoming version of the SageMaker Python SDK, ' \
'framework_version will be required to create an estimator. ' \
'Please add framework_version={} to your constructor to avoid ' \
Contributor: to the constructor instead of to your constructor? Doesn't really matter though.

Contributor Author: I think "your" makes it more clear we're referring to the user's code
'an error in the future.'


def create_image_uri(region, framework, instance_type, framework_version, py_version, account='520713654638',
optimized_families=[]):
@@ -223,3 +227,7 @@ def model_code_key_prefix(code_location_key_prefix, model_name, image):
str: the key prefix to be used in uploading code
"""
return '/'.join(filter(None, [code_location_key_prefix, model_name or name_from_image(image)]))


def empty_framework_version_warning(default_version):
return EMPTY_FRAMEWORK_VERSION_WARNING.format(default_version)
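A unit test for this helper could capture what the `sagemaker` logger emits and assert that the default version lands in the message. A self-contained sketch, where the list-collecting handler is illustrative rather than anything from the SDK's test suite:

```python
import logging

logger = logging.getLogger('sagemaker')
logger.setLevel(logging.WARNING)


def empty_framework_version_warning(default_version):
    # Mirrors the helper added to fw_utils.py in this PR.
    return ('In an upcoming version of the SageMaker Python SDK, '
            'framework_version will be required to create an estimator. '
            'Please add framework_version={} to your constructor to avoid '
            'an error in the future.').format(default_version)


class ListHandler(logging.Handler):
    """Collects formatted log messages so a test can assert on them."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
logger.warning(empty_framework_version_warning('1.2.1'))

assert 'framework_version=1.2.1' in handler.messages[0]
assert handler.messages[0].endswith('an error in the future.')
```

Formatting the template into the message (rather than hard-coding a version) lets each framework pass its own default constant.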
29 changes: 16 additions & 13 deletions src/sagemaker/mxnet/README.rst
@@ -17,11 +17,12 @@ Suppose that you already have an MXNet training script called
.. code:: python

from sagemaker.mxnet import MXNet
mxnet_estimator = MXNet("mxnet-train.py",
role="SageMakerRole",
train_instance_type="ml.p3.2xlarge",
train_instance_count=1)
mxnet_estimator.fit("s3://bucket/path/to/training/data")
mxnet_estimator = MXNet('mxnet-train.py',
role='SageMakerRole',
train_instance_type='ml.p3.2xlarge',
train_instance_count=1,
framework_version='1.2.1')
mxnet_estimator.fit('s3://bucket/path/to/training/data')

Where the s3 url is a path to your training data, within Amazon S3. The constructor keyword arguments define how SageMaker runs your training script and are discussed, in detail, in a later section.

@@ -97,10 +98,11 @@ You run MXNet training scripts on SageMaker by creating ``MXNet`` Estimators. Sa

.. code:: python

mxnet_estimator = MXNet("train.py",
train_instance_type="ml.p2.xlarge",
train_instance_count=1)
mxnet_estimator.fit("s3://my_bucket/my_training_data/")
mxnet_estimator = MXNet('train.py',
train_instance_type='ml.p2.xlarge',
train_instance_count=1,
framework_version='1.2.1')
mxnet_estimator.fit('s3://my_bucket/my_training_data/')

MXNet Estimators
^^^^^^^^^^^^^^^^
@@ -302,10 +304,11 @@ After calling ``fit``, you can call ``deploy`` on an ``MXNet`` Estimator to crea
.. code:: python

# Train my estimator
mxnet_estimator = MXNet("train.py",
train_instance_type="ml.p2.xlarge",
train_instance_count=1)
mxnet_estimator.fit("s3://my_bucket/my_training_data/")
mxnet_estimator = MXNet('train.py',
train_instance_type='ml.p2.xlarge',
train_instance_count=1,
framework_version='1.2.1')
mxnet_estimator.fit('s3://my_bucket/my_training_data/')

# Deploy my estimator to a SageMaker Endpoint and get a Predictor
predictor = mxnet_estimator.deploy(instance_type='ml.m4.xlarge',
14 changes: 11 additions & 3 deletions src/sagemaker/mxnet/estimator.py
@@ -12,20 +12,25 @@
# language governing permissions and limitations under the License.
from __future__ import absolute_import

import logging

from sagemaker.estimator import Framework
from sagemaker.fw_utils import framework_name_from_image, framework_version_from_tag
from sagemaker.fw_utils import framework_name_from_image, framework_version_from_tag, empty_framework_version_warning
from sagemaker.mxnet.defaults import MXNET_VERSION
from sagemaker.mxnet.model import MXNetModel
from sagemaker.vpc_utils import VPC_CONFIG_DEFAULT

logging.basicConfig()
logger = logging.getLogger('sagemaker')


class MXNet(Framework):
"""Handle end-to-end training and deployment of custom MXNet code."""

__framework_name__ = "mxnet"

def __init__(self, entry_point, source_dir=None, hyperparameters=None, py_version='py2',
framework_version=MXNET_VERSION, image_name=None, **kwargs):
framework_version=None, image_name=None, **kwargs):
"""
This ``Estimator`` executes an MXNet script in a managed MXNet execution environment, within a SageMaker
Training Job. The managed MXNet environment is an Amazon-built Docker container that executes functions
@@ -64,7 +69,10 @@ def __init__(self, entry_point, source_dir=None, hyperparameters=None, py_versio
super(MXNet, self).__init__(entry_point, source_dir, hyperparameters,
image_name=image_name, **kwargs)
self.py_version = py_version
self.framework_version = framework_version

if framework_version is None:
logger.warning(empty_framework_version_warning(MXNET_VERSION))
self.framework_version = framework_version or MXNET_VERSION

def create_model(self, model_server_workers=None, role=None, vpc_config_override=VPC_CONFIG_DEFAULT):
"""Create a SageMaker ``MXNetModel`` object that can be deployed to an ``Endpoint``.
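The reason the default moves from `framework_version=MXNET_VERSION` to `framework_version=None` is that a real default value makes it impossible to tell whether the caller set the version explicitly or omitted it. A sketch of that distinction, with illustrative function names and a stand-in version constant:

```python
MXNET_VERSION = '1.2.1'  # illustrative stand-in for the SDK constant


def old_style(framework_version=MXNET_VERSION):
    # With a real default, an explicit '1.2.1' and an omitted argument
    # arrive identically -- there is no way to know when to warn.
    return framework_version


def new_style(framework_version=None):
    # None unambiguously means "not set", so the constructor can warn
    # before falling back to the default.
    should_warn = framework_version is None
    return framework_version or MXNET_VERSION, should_warn


assert old_style() == old_style('1.2.1')       # indistinguishable
assert new_style() == ('1.2.1', True)          # omitted: warn, then default
assert new_style('1.3.0') == ('1.3.0', False)  # explicit: no warning
```

This is why the `framework_version or MXNET_VERSION` fallback happens inside the constructor body rather than in the signature.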
11 changes: 7 additions & 4 deletions src/sagemaker/pytorch/README.rst
@@ -48,7 +48,8 @@ You can then setup a ``PyTorch`` Estimator with keyword arguments to point to th
pytorch_estimator = PyTorch(entry_point='pytorch-train.py',
role='SageMakerRole',
train_instance_type='ml.p3.2xlarge',
train_instance_count=1)
train_instance_count=1,
framework_version='0.4.0')

After that, you simply tell the estimator to start a training job and provide an S3 URL
that is the path to your training data within Amazon S3:
@@ -136,9 +137,10 @@ directories ('train' and 'test').
pytorch_estimator = PyTorch('pytorch-train.py',
train_instance_type='ml.p3.2xlarge',
train_instance_count=1,
hyperparameters = {'epochs': 20, 'batch-size': 64, 'learning-rate':0.1})
framework_version='0.4.0',
hyperparameters = {'epochs': 20, 'batch-size': 64, 'learning-rate': 0.1})
pytorch_estimator.fit({'train': 's3://my-data-bucket/path/to/my/training/data',
'test': 's3://my-data-bucket/path/to/my/test/data'})
'test': 's3://my-data-bucket/path/to/my/test/data'})


PyTorch Estimators
@@ -318,7 +320,8 @@ operation.
# Train my estimator
pytorch_estimator = PyTorch(entry_point='train_and_deploy.py',
train_instance_type='ml.p3.2xlarge',
train_instance_count=1)
train_instance_count=1,
framework_version='0.4.0')
pytorch_estimator.fit('s3://my_bucket/my_training_data/')

# Deploy my estimator to a SageMaker Endpoint and get a Predictor
15 changes: 12 additions & 3 deletions src/sagemaker/pytorch/estimator.py
@@ -11,20 +11,26 @@
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
from __future__ import absolute_import

import logging

from sagemaker.estimator import Framework
from sagemaker.fw_utils import framework_name_from_image, framework_version_from_tag
from sagemaker.fw_utils import framework_name_from_image, framework_version_from_tag, empty_framework_version_warning
from sagemaker.pytorch.defaults import PYTORCH_VERSION, PYTHON_VERSION
from sagemaker.pytorch.model import PyTorchModel
from sagemaker.vpc_utils import VPC_CONFIG_DEFAULT

logging.basicConfig()
logger = logging.getLogger('sagemaker')


class PyTorch(Framework):
"""Handle end-to-end training and deployment of custom PyTorch code."""

__framework_name__ = "pytorch"

def __init__(self, entry_point, source_dir=None, hyperparameters=None, py_version=PYTHON_VERSION,
framework_version=PYTORCH_VERSION, image_name=None, **kwargs):
framework_version=None, image_name=None, **kwargs):
"""
This ``Estimator`` executes an PyTorch script in a managed PyTorch execution environment, within a SageMaker
Training Job. The managed PyTorch environment is an Amazon-built Docker container that executes functions
@@ -62,7 +68,10 @@ def __init__(self, entry_point, source_dir=None, hyperparameters=None, py_versio
"""
super(PyTorch, self).__init__(entry_point, source_dir, hyperparameters, image_name=image_name, **kwargs)
self.py_version = py_version
self.framework_version = framework_version

if framework_version is None:
logger.warning(empty_framework_version_warning(PYTORCH_VERSION))
self.framework_version = framework_version or PYTORCH_VERSION

def create_model(self, model_server_workers=None, role=None, vpc_config_override=VPC_CONFIG_DEFAULT):
"""Create a SageMaker ``PyTorchModel`` object that can be deployed to an ``Endpoint``.