Add SageMaker Elastic Inference test #21

Merged
merged 7 commits into from
Jun 17, 2019
51 changes: 48 additions & 3 deletions README.rst
@@ -88,7 +88,7 @@ Don't forget the period at the end of the command!
Amazon Elastic Inference with MXNet in SageMaker
------------------------------------------------
`Amazon Elastic Inference <https://aws.amazon.com/machine-learning/elastic-inference/>`__ allows you to attach
-low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances to reduce the cost running deep
+low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances to reduce the cost of running deep
learning inference by up to 75%. Currently, Amazon Elastic Inference supports TensorFlow, Apache MXNet, and ONNX
models, with more frameworks coming soon.

@@ -158,7 +158,7 @@ Your Docker image must also be built in order to run the tests against it.

Local integration tests use the following pytest arguments:

-- ``docker-base-name``: the Docker image's repository. Defaults to 'preprod-mxnet'.
+- ``docker-base-name``: the Docker image's repository. Defaults to 'preprod-mxnet-serving'.
- ``framework-version``: the MXNet version. Defaults to the latest supported version.
- ``py-version``: the Python version. Defaults to '3'.
- ``processor``: CPU or GPU. Defaults to 'cpu'.
@@ -183,6 +183,51 @@ To run local integration tests:
--framework-version 1.4.0 \
--processor cpu

SageMaker Integration Tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~

SageMaker integration tests require your Docker image to be within an `Amazon ECR repository <https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_Console_Repositories.html>`__.

SageMaker integration tests use the following pytest arguments:

- ``docker-base-name``: the Docker image's `ECR repository namespace <https://docs.aws.amazon.com/AmazonECR/latest/userguide/Repositories.html>`__.
- ``framework-version``: the MXNet version. Defaults to the latest supported version.
- ``py-version``: the Python version. Defaults to '3'.
- ``processor``: CPU or GPU. Defaults to 'cpu'.
- ``tag``: the Docker image's tag. Defaults to <mxnet_version>-<processor>-py<py-version>
- ``aws-id``: your AWS account ID.
- ``instance-type``: the specified `Amazon SageMaker Instance Type <https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ that the tests will run on.
Defaults to 'ml.c4.xlarge' for CPU and 'ml.p2.xlarge' for GPU.
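
The documented defaults for ``tag`` and ``instance-type`` can be sketched in a few lines of Python. This is an illustration only: the helper names ``default_tag`` and ``default_instance_type`` are hypothetical and are not taken from the repository's test configuration, which this diff does not show.

```python
def default_tag(framework_version, processor='cpu', py_version='3'):
    """Build the documented default tag: <mxnet_version>-<processor>-py<py-version>."""
    return '{}-{}-py{}'.format(framework_version, processor, py_version)


def default_instance_type(processor='cpu'):
    """Return the documented default instance type for the given processor."""
    return 'ml.p2.xlarge' if processor == 'gpu' else 'ml.c4.xlarge'


if __name__ == '__main__':
    print(default_tag('1.4.0'))
    print(default_instance_type('gpu'))
```

With MXNet 1.4.0 on CPU and Python 3, the tag produced this way matches the ``1.4.0-cpu-py3`` value used in the example command.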

To run SageMaker integration tests:

::

    tox test/integration/sagemaker -- --aws-id <your_aws_id> \
        --docker-base-name <your_docker_image> \
        --instance-type <amazon_sagemaker_instance_type> \
        --tag <your_docker_image_tag>

::

    # Example
    tox test/integration/sagemaker -- --aws-id 12345678910 \
        --docker-base-name preprod-mxnet-serving \
        --instance-type ml.m4.xlarge \
        --tag 1.4.0-cpu-py3

To run a SageMaker end-to-end test for your Elastic Inference container, provide ``accelerator-type`` as an additional pytest argument.

``accelerator-type`` is the `Amazon Elastic Inference Accelerator <https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ type to attach to your instance.

::

    # Example for running Elastic Inference SageMaker end to end test
    tox test/integration/sagemaker/test_elastic_inference.py -- --aws-id 12345678910 \
        --docker-base-name preprod-mxnet-serving \
        --instance-type ml.m4.xlarge \
        --accelerator-type ml.eia1.medium \
        --tag 1.0
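
For orientation, ``aws-id``, ``docker-base-name``, and ``tag`` together identify an image in Amazon ECR, whose image URIs follow the pattern ``<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>``. The sketch below only illustrates that pattern: the function name ``ecr_image_uri`` is hypothetical, the region ``us-west-2`` is an assumed value, and how the test suite itself assembles the URI is not shown in this diff.

```python
def ecr_image_uri(aws_id, region, docker_base_name, tag):
    """Compose an ECR image URI from the pytest arguments described above."""
    return '{}.dkr.ecr.{}.amazonaws.com/{}:{}'.format(
        aws_id, region, docker_base_name, tag)


if __name__ == '__main__':
    # Values from the example command; the region is an assumption.
    print(ecr_image_uri('12345678910', 'us-west-2', 'preprod-mxnet-serving', '1.0'))
```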

Contributing
------------
@@ -195,4 +240,4 @@ License

SageMaker MXNet Containers is licensed under the Apache 2.0 License.
It is copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
-The license is available at: http://aws.amazon.com/apache2.0/
+The license is available at: http://aws.amazon.com/apache2.0/
4 changes: 4 additions & 0 deletions test/integration/__init__.py
@@ -15,3 +15,7 @@
import os

RESOURCE_PATH = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', 'resources'))

# EI is currently only supported in the following regions
# regions were derived from https://aws.amazon.com/machine-learning/elastic-inference/pricing/
EI_SUPPORTED_REGIONS = ['us-east-1', 'us-east-2', 'us-west-2', 'eu-west-1', 'ap-northeast-1', 'ap-northeast-2']
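
The region list exists to gate the Elastic Inference test. A pure-function sketch of that gating, mirroring the two skip fixtures in ``test_elastic_inference.py``, might look like the following. The function name ``ei_skip_reason`` is hypothetical; the PR itself implements the checks as autouse pytest fixtures that call ``pytest.skip``.

```python
EI_SUPPORTED_REGIONS = ['us-east-1', 'us-east-2', 'us-west-2',
                        'eu-west-1', 'ap-northeast-1', 'ap-northeast-2']


def ei_skip_reason(region, accelerator_type):
    """Return a skip message, or None when the EI test should run."""
    if accelerator_type is None:
        return 'Skipping because accelerator type was not provided'
    if region not in EI_SUPPORTED_REGIONS:
        return 'EI is not supported in {}'.format(region)
    return None


if __name__ == '__main__':
    print(ei_skip_reason('us-west-2', 'ml.eia1.medium'))
    print(ei_skip_reason('sa-east-1', 'ml.eia1.medium'))
```

Keeping the decision in a plain function makes it checkable without deploying an endpoint; the fixtures would then only need to pass the returned message to ``pytest.skip``.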
Empty file.
64 changes: 64 additions & 0 deletions test/integration/sagemaker/test_elastic_inference.py
@@ -0,0 +1,64 @@
# Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.
from __future__ import absolute_import

import os

import pytest
from sagemaker import utils
from sagemaker.mxnet import MXNetModel

from test.integration import EI_SUPPORTED_REGIONS, RESOURCE_PATH
from test.integration.sagemaker.timeout import timeout_and_delete_endpoint_by_name

DEFAULT_HANDLER_PATH = os.path.join(RESOURCE_PATH, 'default_handlers')
MODEL_PATH = os.path.join(DEFAULT_HANDLER_PATH, 'model.tar.gz')
SCRIPT_PATH = os.path.join(DEFAULT_HANDLER_PATH, 'model', 'code', 'empty_module.py')


@pytest.fixture(autouse=True)
def skip_if_no_accelerator(accelerator_type):
    if accelerator_type is None:
        pytest.skip('Skipping because accelerator type was not provided')


@pytest.fixture(autouse=True)
def skip_if_non_supported_ei_region(region):
    if region not in EI_SUPPORTED_REGIONS:
        pytest.skip('EI is not supported in {}'.format(region))


@pytest.mark.skip_if_non_supported_ei_region
@pytest.mark.skip_if_no_accelerator
def test_elastic_inference(ecr_image, sagemaker_session, instance_type, accelerator_type, framework_version):
    endpoint_name = utils.unique_name_from_base('test-mxnet-ei')

    with timeout_and_delete_endpoint_by_name(endpoint_name=endpoint_name,
                                             sagemaker_session=sagemaker_session,
                                             minutes=20):
        prefix = 'mxnet-serving/default-handlers'
        model_data = sagemaker_session.upload_data(path=MODEL_PATH, key_prefix=prefix)
        model = MXNetModel(model_data=model_data,
                           entry_point=SCRIPT_PATH,
                           role='SageMakerRole',
                           image=ecr_image,
                           framework_version=framework_version,
                           sagemaker_session=sagemaker_session)

        predictor = model.deploy(initial_instance_count=1,
                                 instance_type=instance_type,
                                 accelerator_type=accelerator_type,
                                 endpoint_name=endpoint_name)

        output = predictor.predict([[1, 2]])
        assert [[4.9999918937683105]] == output
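
The final assertion compares floating-point output for exact equality. If that ever proved sensitive to the accelerator or framework build, a tolerance-based comparison is one possible alternative. The sketch below is not part of the PR: the helper name ``nested_close`` and the default tolerance are assumptions, and it relies on ``math.isclose``, which requires Python 3.5 or later.

```python
import math


def nested_close(expected, actual, rel_tol=1e-6):
    """Compare nested lists of floats element-wise within a relative tolerance."""
    if isinstance(expected, (list, tuple)):
        return (isinstance(actual, (list, tuple))
                and len(expected) == len(actual)
                and all(nested_close(e, a, rel_tol)
                        for e, a in zip(expected, actual)))
    return math.isclose(expected, actual, rel_tol=rel_tol)


if __name__ == '__main__':
    print(nested_close([[4.9999918937683105]], [[4.999992]]))
```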
3 changes: 2 additions & 1 deletion test/integration/sagemaker/test_hosting.py
@@ -25,13 +25,14 @@
SCRIPT_PATH = os.path.join(DEFAULT_HANDLER_PATH, 'model', 'code', 'empty_module.py')


-def test_hosting(sagemaker_session, ecr_image, instance_type):
+def test_hosting(sagemaker_session, ecr_image, instance_type, framework_version):
     prefix = 'mxnet-serving/default-handlers'
     model_data = sagemaker_session.upload_data(path=MODEL_PATH, key_prefix=prefix)
     model = MXNetModel(model_data,
                        'SageMakerRole',
                        SCRIPT_PATH,
                        image=ecr_image,
+                       framework_version=framework_version,
                        sagemaker_session=sagemaker_session)

endpoint_name = utils.unique_name_from_base('test-mxnet-serving')