Commit c52e893

doc: address merge conflicts
2 parents 62d81b5 + 5f14219 commit c52e893

7 files changed: +163 -6 lines changed

CHANGELOG.md

Lines changed: 23 additions & 0 deletions
@@ -1,5 +1,28 @@
 # Changelog

+## v1.35.0 (2019-07-31)
+
+### Features
+
+* allow serving image to be specified when calling MXNet.deploy
+
+## v1.34.3 (2019-07-30)
+
+### Bug fixes and other changes
+
+* waiting for training tags to propagate in the test
+
+## v1.34.2 (2019-07-29)
+
+### Bug fixes and other changes
+
+* removing unnecessary tests cases
+* Replaced generic ValueError with custom subclass when reporting unexpected resource status
+
+### Documentation changes
+
+* correct wording for Cloud9 environment setup instructions
+
 ## v1.34.1 (2019-07-23)

 ### Bug fixes and other changes

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-1.34.2.dev0
+1.35.1.dev0

doc/using_mxnet.rst

Lines changed: 94 additions & 0 deletions
@@ -801,7 +801,101 @@ The following are optional arguments. When you create an ``MXNet`` object, you c
 SageMaker MXNet Containers
 **************************

+=======
+
+Required arguments
+==================
+
+The following are required arguments to the ``MXNet`` constructor. When you create an MXNet object, you must include these in the constructor, either positionally or as keyword arguments.
+
+- ``entry_point`` Path (absolute or relative) to the Python file which
+  should be executed as the entry point to training.
+- ``role`` An AWS IAM role (either name or full ARN). The Amazon
+  SageMaker training jobs and APIs that create Amazon SageMaker
+  endpoints use this role to access training data and model artifacts.
+  After the endpoint is created, the inference code might use the IAM
+  role, if accessing AWS resource.
+- ``train_instance_count`` Number of Amazon EC2 instances to use for
+  training.
+- ``train_instance_type`` Type of EC2 instance to use for training, for
+  example, 'ml.c4.xlarge'.
+
+Optional arguments
+==================
+
+The following are optional arguments. When you create an ``MXNet`` object, you can specify these as keyword arguments.
+
+- ``source_dir`` Path (absolute or relative) to a directory with any
+  other training source code dependencies including the entry point
+  file. Structure within this directory will be preserved when training
+  on SageMaker.
+- ``dependencies (list[str])`` A list of paths to directories (absolute or relative) with
+  any additional libraries that will be exported to the container (default: ``[]``).
+  The library folders will be copied to SageMaker in the same folder where the entrypoint is copied.
+  If the ``source_dir`` points to S3, code will be uploaded and the S3 location will be used
+  instead. For example, the following call
+
+  >>> MXNet(entry_point='train.py', dependencies=['my/libs/common', 'virtual-env'])
+
+  results in the following inside the container:
+
+  .. code::
+
+      opt/ml/code
+      ├── train.py
+      ├── common
+      └── virtual-env
+
+- ``hyperparameters`` Hyperparameters that will be used for training.
+  Will be made accessible as a dict[str, str] to the training code on
+  SageMaker. For convenience, accepts other types besides str, but
+  str() will be called on keys and values to convert them before
+  training.
+- ``py_version`` Python version you want to use for executing your
+  model training code. Valid values: 'py2' and 'py3'.
+- ``train_volume_size`` Size in GB of the EBS volume to use for storing
+  input data during training. Must be large enough to store training
+  data if input_mode='File' is used (which is the default).
+- ``train_max_run`` Timeout in seconds for training, after which Amazon
+  SageMaker terminates the job regardless of its current status.
+- ``input_mode`` The input mode that the algorithm supports. Valid
+  modes: 'File' - Amazon SageMaker copies the training dataset from the
+  S3 location to a directory in the Docker container. 'Pipe' - Amazon
+  SageMaker streams data directly from S3 to the container via a Unix
+  named pipe.
+- ``output_path`` Location where you want the training result (model artifacts and optional output files) saved.
+  This should be an S3 location unless you're using Local Mode, which also supports local output paths.
+  If not specified, results are stored to a default S3 bucket.
+- ``output_kms_key`` Optional KMS key ID to optionally encrypt training
+  output with.
+- ``job_name`` Name to assign for the training job that the fit()
+  method launches. If not specified, the estimator generates a default
+  job name, based on the training image name and current timestamp
+- ``image_name`` An alternative docker image to use for training and
+  serving. If specified, the estimator will use this image for training and
+  hosting, instead of selecting the appropriate SageMaker official image based on
+  framework_version and py_version. Refer to: `SageMaker MXNet Docker Containers
+  <#sagemaker-mxnet-docker-containers>`_ for details on what the Official images support
+  and where to find the source code to build your custom image.
+- ``distributions`` For versions 1.3 and above only.
+  Specifies information for how to run distributed training.
+  To launch a parameter server during training, set this argument to:
+
+  .. code::
+
+      {
+          'parameter_server': {
+              'enabled': True
+          }
+      }
+
+**************************
+SageMaker MXNet Containers
+**************************
+
 For information about SageMaker MXNet containers, see the following topics:

 - training: https://github.com/aws/sagemaker-mxnet-container
 - serving: https://github.com/aws/sagemaker-mxnet-serving-container
+
+For information about the dependencies installed in SageMaker MXNet containers, see https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/mxnet/README.rst#sagemaker-mxnet-containers.
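
Note on the ``hyperparameters`` entry in this hunk: it says that str() is called on keys and values before training. A minimal sketch of that conversion follows, assuming a plain dict comprehension. The helper name `stringify_hyperparameters` is hypothetical and is not taken from the SDK, whose actual serialization may differ.

```python
def stringify_hyperparameters(hyperparameters):
    """Hypothetical helper: apply str() to every key and value.

    Mirrors the behavior described in the doc text; not the SDK's code.
    """
    return {str(key): str(value) for key, value in (hyperparameters or {}).items()}


# Mixed-type hyperparameters as a caller might pass them to the estimator.
raw = {"epochs": 10, "learning_rate": 0.01, "use_gpu": True, 1: "one"}
converted = stringify_hyperparameters(raw)
print(converted)
```

Because every value arrives as a string, a training script that follows this description would need to parse numeric values back itself, for example with `int()` or `float()`.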

doc/using_sklearn.rst

Lines changed: 10 additions & 0 deletions
@@ -97,6 +97,16 @@ inadvertently run your training code at the wrong point in execution.

 For more on training environment variables, please visit https://github.com/aws/sagemaker-containers.

+Using third-party libraries
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When running your training script on SageMaker, it will have access to some pre-installed third-party libraries including ``scikit-learn``, ``numpy``, and ``pandas``.
+For more information on the runtime environment, including specific package versions, see `SageMaker Scikit-learn Docker containers <https://github.com/aws/sagemaker-scikit-learn-container>`__.
+
+If there are other packages you want to use with your script, you can include a ``requirements.txt`` file in the same directory as your training script to install other dependencies at runtime.
+A ``requirements.txt`` file is a text file that contains a list of items that are installed by using ``pip install``. You can also specify the version of an item to install.
+For information about the format of a ``requirements.txt`` file, see `Requirements Files <https://pip.pypa.io/en/stable/user_guide/#requirements-files>`__ in the pip documentation.
+
 Running a Scikit-learn training script in SageMaker
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

102112

src/sagemaker/mxnet/estimator.py

Lines changed: 8 additions & 1 deletion
@@ -140,6 +140,7 @@ def create_model(
         entry_point=None,
         source_dir=None,
         dependencies=None,
+        image_name=None,
     ):
         """Create a SageMaker ``MXNetModel`` object that can be deployed to an
         ``Endpoint``.
@@ -164,6 +165,12 @@ def create_model(
             dependencies (list[str]): A list of paths to directories (absolute or relative) with
                 any additional libraries that will be exported to the container.
                 If not specified, the dependencies from training are used.
+            image_name (str): If specified, the estimator will use this image for hosting, instead
+                of selecting the appropriate SageMaker official image based on framework_version
+                and py_version. It can be an ECR url or dockerhub image and tag.
+                Examples:
+                    123.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0
+                    custom-image:latest.

         Returns:
             sagemaker.mxnet.model.MXNetModel: A SageMaker ``MXNetModel`` object.
@@ -180,7 +187,7 @@ def create_model(
             code_location=self.code_location,
             py_version=self.py_version,
             framework_version=self.framework_version,
-            image=self.image_name,
+            image=(image_name or self.image_name),
             model_server_workers=model_server_workers,
             sagemaker_session=self.sagemaker_session,
             vpc_config=self.get_vpc_config(vpc_config_override),
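
Note on the `image_name or self.image_name` expression in this hunk: a truthy `image_name` argument overrides the estimator's image, and anything falsy falls back to it. The sketch below illustrates only that fallback. `EstimatorSketch` and `hosting_image` are hypothetical stand-ins, not SageMaker SDK names.

```python
class EstimatorSketch:
    """Hypothetical stand-in for an estimator; not the SageMaker SDK class."""

    def __init__(self, image_name=None):
        self.image_name = image_name

    def hosting_image(self, image_name=None):
        # Same expression as the diff: a truthy override wins, otherwise
        # fall back to the image configured on the estimator (possibly None).
        return image_name or self.image_name


est = EstimatorSketch(image_name="mxnet:2.0")
print(est.hosting_image())                     # no override given
print(est.hosting_image("mxnet_hosting:2.0"))  # explicit hosting image
print(est.hosting_image(""))                   # empty string is falsy
```

One consequence of using `or` rather than an `is None` check is that an empty string is treated the same as no override.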

tests/integ/test_tf_script_mode.py

Lines changed: 4 additions & 4 deletions
@@ -146,14 +146,14 @@ def test_mnist_async(sagemaker_session):
     training_job_name = estimator.latest_training_job.name
     time.sleep(20)
     endpoint_name = training_job_name
-    model_name = "model-name-1"
     _assert_training_job_tags_match(
         sagemaker_session.sagemaker_client, estimator.latest_training_job.name, TAGS
     )
     with tests.integ.timeout.timeout_and_delete_endpoint_by_name(endpoint_name, sagemaker_session):
         estimator = TensorFlow.attach(
             training_job_name=training_job_name, sagemaker_session=sagemaker_session
         )
+        model_name = "model-mnist-async"
         predictor = estimator.deploy(
             initial_instance_count=1,
             instance_type="ml.c4.xlarge",
@@ -215,14 +215,14 @@ def _assert_s3_files_exist(s3_url, files):
             raise ValueError("File {} is not found under {}".format(f, s3_url))


-def _assert_tags_match(sagemaker_client, resource_arn, tags, retries=1):
+def _assert_tags_match(sagemaker_client, resource_arn, tags, retries=15):
     actual_tags = None
     for _ in range(retries):
         actual_tags = sagemaker_client.list_tags(ResourceArn=resource_arn)["Tags"]
         if actual_tags:
             break
         else:
-            # endpoint tags might take minutes to propagate. Sleeping.
+            # endpoint and training tags might take minutes to propagate. Sleeping.
             time.sleep(30)
     assert actual_tags == tags

@@ -235,7 +235,7 @@ def _assert_model_tags_match(sagemaker_client, model_name, tags):
 def _assert_endpoint_tags_match(sagemaker_client, endpoint_name, tags):
     endpoint_description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)

-    _assert_tags_match(sagemaker_client, endpoint_description["EndpointArn"], tags, retries=10)
+    _assert_tags_match(sagemaker_client, endpoint_description["EndpointArn"], tags)


 def _assert_training_job_tags_match(sagemaker_client, training_job_name, tags):
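
Note on the retry change in this file: with `retries=15` and a 30-second sleep after each empty result, `_assert_tags_match` can wait up to 450 seconds (7.5 minutes) before asserting. The same polling pattern can be sketched without AWS as below. `wait_for_tags`, its `delay` and `sleep` parameters, and the fake tag source are all hypothetical names introduced for illustration, with the sleep function injected so the example runs instantly.

```python
import time


def wait_for_tags(list_tags, retries=15, delay=30, sleep=time.sleep):
    """Poll list_tags() until it returns a non-empty result or retries run out."""
    tags = None
    for _ in range(retries):
        tags = list_tags()
        if tags:
            break
        # Nothing yet: wait before the next poll, as the test helper does.
        sleep(delay)
    return tags


# Fake tag source: empty for the first two polls, then populated.
responses = iter([[], [], [{"Key": "team", "Value": "ml"}]])
sleeps = []  # records requested delays instead of actually sleeping
result = wait_for_tags(lambda: next(responses), retries=5, sleep=sleeps.append)
print(result, sleeps)
```

Injecting `sleep` keeps the loop testable; a caller that wants real waiting simply leaves the default `time.sleep` in place.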

tests/unit/test_mxnet.py

Lines changed: 23 additions & 0 deletions
@@ -704,3 +704,26 @@ def test_empty_framework_version(warning, sagemaker_session):

     assert mx.framework_version == defaults.MXNET_VERSION
     warning.assert_called_with(defaults.MXNET_VERSION, mx.LATEST_VERSION)
+
+
+def test_create_model_with_custom_hosting_image(sagemaker_session):
+    container_log_level = '"logging.INFO"'
+    source_dir = "s3://mybucket/source"
+    custom_image = "mxnet:2.0"
+    custom_hosting_image = "mxnet_hosting:2.0"
+    mx = MXNet(
+        entry_point=SCRIPT_PATH,
+        role=ROLE,
+        sagemaker_session=sagemaker_session,
+        train_instance_count=INSTANCE_COUNT,
+        train_instance_type=INSTANCE_TYPE,
+        image_name=custom_image,
+        container_log_level=container_log_level,
+        base_job_name="job",
+        source_dir=source_dir,
+    )
+
+    mx.fit(inputs="s3://mybucket/train", job_name="new_name")
+    model = mx.create_model(image_name=custom_hosting_image)
+
+    assert model.image == custom_hosting_image
