Add EI Dockerfile for 1.11 #163

Merged · 5 commits · Feb 12, 2019
33 changes: 22 additions & 11 deletions README.rst
@@ -134,16 +134,15 @@ Then run:
    # All build instructions assume you're building from the same directory as the Dockerfile.

    # CPU
-    docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu .
+    docker build -t <image_name>:<tag> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu .

    # GPU
-    docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu .
+    docker build -t <image_name>:<tag> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu .

::

    # Example
-    docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg py_version=2 \
-        --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
+    docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .

The dockerfiles for 1.4 and 1.5 build from source instead, so when building those, you don't need to download the wheel beforehand. A minimal sketch of such a build (the actual example is collapsed in this diff view, so the image name and tag below are illustrative):
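
::

    # Illustrative placeholder -- adjust the image name and tag to your build.
    docker build -t tensorflow-base:1.4.1-cpu-py2 -f Dockerfile.cpu .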

@@ -188,7 +187,7 @@ versions of the frameworks are automatically built into containers when you use
download them as binary files and import them into your own Docker containers. The enhanced TensorFlow serving binaries are available on Amazon S3 at https://s3.console.aws.amazon.com/s3/buckets/amazonei-tensorflow.

The SageMaker TensorFlow containers with Amazon Elastic Inference support were built from the
-`EI Dockerfile <https://github.com/aws/sagemaker-tensorflow-container/blob/master/docker/1.12.0/final/py2/Dockerfile.ei>`__ starting at TensorFlow 1.12.0 and above.
+`EI Dockerfile <https://github.com/aws/sagemaker-tensorflow-container/blob/master/docker/1.11.0/final/py2/Dockerfile.ei>`__ starting at TensorFlow 1.11.0 and above.

The instructions for building the SageMaker TensorFlow containers with Amazon Elastic Inference support are similar to the steps `above <https://github.com/aws/sagemaker-tensorflow-container#final-images>`__.

@@ -197,9 +196,9 @@ The only difference is the addition of the ``tensorflow_model_server`` build-arg
::

    # Example
-    docker build -t preprod-tensorflow-ei:1.12.0-cpu-py2 --build-arg py_version=2 \
-        --build-arg tensorflow_model_server=AmazonEI_TensorFlow_Serving_v1.12_v1 \
-        --build-arg framework_installable=tensorflow-1.12.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
+    docker build -t preprod-tensorflow-ei:1.11.0-cpu-py2 \
+        --build-arg tensorflow_model_server=AmazonEI_TensorFlow_Serving_v1.11_v1 \
+        --build-arg framework_installable=tensorflow-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .


* For information about downloading the enhanced versions of TensorFlow serving, see `Using TensorFlow Models with Amazon EI <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ei-tensorflow.html>`__.
@@ -273,10 +272,10 @@ Functional Tests
Functional tests require your Docker image to be within an `Amazon ECR repository <https://docs
.aws.amazon.com/AmazonECS/latest/developerguide/ECS_Console_Repositories.html>`__.

-The Docker-base-name is your `ECR repository namespace <https://docs.aws.amazon
+The `docker-base-name` is your `ECR repository namespace <https://docs.aws.amazon
.com/AmazonECR/latest/userguide/Repositories.html>`__.

-The instance-type is your specified `Amazon SageMaker Instance Type
+The `instance-type` is your specified `Amazon SageMaker Instance Type
<https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ that the functional test will run on.


@@ -292,7 +291,6 @@ SageMaker <https://aws.amazon.com/sagemaker/>`__, then use:
::

    # Required arguments for integration tests are found in test/functional/conftest.py

    pytest test/functional --aws-id <your_aws_id> \
        --docker-base-name <your_docker_image> \
        --instance-type <amazon_sagemaker_instance_type> \
@@ -306,6 +304,19 @@ SageMaker <https://aws.amazon.com/sagemaker/>`__, then use:
        --instance-type ml.m4.xlarge \
        --tag 1.0

If you want to run an end-to-end functional test for your Elastic Inference container, you will need to provide an `accelerator-type` as an additional pytest argument.

The `accelerator-type` is your specified `Amazon Elastic Inference Accelerator <https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ type that will be attached to your instance type.

::

    # Example for running Elastic Inference functional test
    pytest test/functional/test_elastic_inference.py --aws-id 12345678910 \
        --docker-base-name preprod-tensorflow \
        --instance-type ml.m4.xlarge \
        --accelerator-type ml.eia1.medium \
        --tag 1.0

Contributing
------------

80 changes: 80 additions & 0 deletions docker/1.11.0/final/py2/Dockerfile.ei
@@ -0,0 +1,80 @@
FROM ubuntu:16.04

MAINTAINER Amazon AI

ARG framework_installable
ARG framework_support_installable=sagemaker_tensorflow_container-1.0.0.tar.gz
ARG tensorflow_model_server

WORKDIR /root

COPY $framework_installable .
COPY $framework_support_installable .

RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        curl \
        git \
        libcurl3-dev \
        libfreetype6-dev \
        libpng12-dev \
        libzmq3-dev \
        pkg-config \
        python-dev \
        rsync \
        software-properties-common \
        unzip \
        zip \
        zlib1g-dev \
        openjdk-8-jdk \
        openjdk-8-jre-headless \
        wget \
        vim \
        iputils-ping \
        nginx \
    && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
    python get-pip.py && \
    rm get-pip.py

RUN pip --no-cache-dir install \
        numpy \
        scipy \
        sklearn \
        pandas \
        Pillow \
        h5py

# TODO: upgrade to tf serving 1.8, which requires more work with updating
# dependencies. See current work in progress in tfserving-1.8 branch.
ENV TF_SERVING_VERSION=1.7.0

RUN pip install numpy boto3 six awscli flask==0.11 Jinja2==2.9 tensorflow-serving-api==$TF_SERVING_VERSION gevent gunicorn

# Install TF Serving pkg
COPY $tensorflow_model_server /usr/bin/tensorflow_model_server

# Update libstdc++6, as required by tensorflow-serving >= 1.6: https://github.com/tensorflow/serving/issues/819
RUN add-apt-repository ppa:ubuntu-toolchain-r/test -y && \
    apt-get update && \
    apt-get install -y libstdc++6

RUN framework_installable_local=$(basename $framework_installable) && \
    framework_support_installable_local=$(basename $framework_support_installable) && \
    \
    pip install --no-cache --upgrade $framework_installable_local && \
    pip install $framework_support_installable_local && \
    pip install "sagemaker-tensorflow>=1.10,<1.11" && \
    \
    rm $framework_installable_local && \
    rm $framework_support_installable_local

# Set environment variables for MKL
# TODO: investigate the right value for OMP_NUM_THREADS
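# KMP_AFFINITY pins OpenMP threads to cores (fine granularity, compact placement);
# KMP_BLOCKTIME is how long (in ms) a thread spins after a parallel region before
# sleeping; KMP_SETTINGS=0 suppresses printing of the OpenMP runtime settings.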
ENV KMP_AFFINITY=granularity=fine,compact,1,0 KMP_BLOCKTIME=1 KMP_SETTINGS=0

# entry.py comes from sagemaker-container-support
ENTRYPOINT ["entry.py"]
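
For reference, a minimal sketch of how this image might be built, assuming the TensorFlow wheel, the support package, and the EI model server binary sit next to the Dockerfile (file names follow the README example above and are placeholders):

::

    docker build -t preprod-tensorflow-ei:1.11.0-cpu-py2 \
        --build-arg framework_installable=tensorflow-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl \
        --build-arg framework_support_installable=sagemaker_tensorflow_container-1.0.0.tar.gz \
        --build-arg tensorflow_model_server=AmazonEI_TensorFlow_Serving_v1.11_v1 \
        -f Dockerfile.ei .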
3 changes: 1 addition & 2 deletions src/tf_container/serve.py
@@ -259,8 +259,7 @@ def _default_input_fn(self, serialized_data, content_type):

    @classmethod
    def from_module(cls, m, grpc_proxy_client):
-        """Initialize a Transformer using functions supplied by the given module. The module
-        must supply a ``model_fn()`` that returns an MXNet Module.
+        """Initialize a Transformer using functions supplied by the given module.

        If the module contains a ``transform_fn``, it will be used to handle incoming request
        data, execute the model prediction, and generate the response content.
15 changes: 9 additions & 6 deletions test/functional/__init__.py
@@ -1,13 +1,16 @@
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

# EI is currently only supported in the following regions
# regions were derived from https://aws.amazon.com/machine-learning/elastic-inference/pricing/
EI_SUPPORTED_REGIONS = ['us-east-1', 'us-east-2', 'us-west-2', 'eu-west-1', 'ap-northeast-1', 'ap-northeast-2']
18 changes: 12 additions & 6 deletions test/functional/conftest.py
@@ -1,14 +1,14 @@
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

import logging
@@ -29,6 +29,7 @@ def pytest_addoption(parser):
    parser.addoption('--aws-id')
    parser.addoption('--docker-base-name', default='preprod-tensorflow')
    parser.addoption('--instance-type')
+    parser.addoption('--accelerator-type', default=None)
    parser.addoption('--region', default='us-west-2')
    parser.addoption('--tag')

@@ -48,6 +49,11 @@ def instance_type(request):
    return request.config.getoption('--instance-type')


+@pytest.fixture(scope='session')
+def accelerator_type(request):
+    return request.config.getoption('--accelerator-type')


@pytest.fixture(scope='session')
def region(request):
    return request.config.getoption('--region')
78 changes: 78 additions & 0 deletions test/functional/test_elastic_inference.py
@@ -0,0 +1,78 @@
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

import logging
import os

import numpy as np
import pytest
from sagemaker.tensorflow import TensorFlowModel
from sagemaker.utils import sagemaker_timestamp

from test.functional import EI_SUPPORTED_REGIONS
from test.integ.conftest import SCRIPT_PATH
from test.resources.python_sdk.timeout import timeout_and_delete_endpoint_by_name

logger = logging.getLogger(__name__)
logging.getLogger('boto3').setLevel(logging.INFO)
logging.getLogger('botocore').setLevel(logging.INFO)
logging.getLogger('factory.py').setLevel(logging.INFO)
logging.getLogger('auth.py').setLevel(logging.INFO)
logging.getLogger('connectionpool.py').setLevel(logging.INFO)
logging.getLogger('session.py').setLevel(logging.DEBUG)
logging.getLogger('sagemaker').setLevel(logging.DEBUG)


@pytest.fixture(autouse=True)
def skip_if_no_accelerator(accelerator_type):
    if accelerator_type is None:
        pytest.skip('Skipping because accelerator type was not provided')


@pytest.fixture(autouse=True)
def skip_if_non_supported_ei_region(region):
    if region not in EI_SUPPORTED_REGIONS:
        pytest.skip('EI is not supported in {}'.format(region))


@pytest.fixture
def pretrained_model_data(region):
    return 's3://sagemaker-sample-data-{}/tensorflow/model/resnet/resnet_50_v2_fp32_NCHW.tar.gz'.format(region)


# based on https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_using_elastic_inference_with_your_own_model/tensorflow_pretrained_model_elastic_inference.ipynb
@pytest.mark.skip_if_non_supported_ei_region
@pytest.mark.skip_if_no_accelerator
def test_deploy_elastic_inference_with_pretrained_model(pretrained_model_data, docker_image_uri, sagemaker_session, instance_type, accelerator_type):
    resource_path = os.path.join(SCRIPT_PATH, '../resources')
    endpoint_name = 'test-tf-ei-deploy-model-{}'.format(sagemaker_timestamp())

    with timeout_and_delete_endpoint_by_name(endpoint_name=endpoint_name, sagemaker_session=sagemaker_session,
                                             minutes=20):
        tensorflow_model = TensorFlowModel(model_data=pretrained_model_data,
                                           entry_point='default_entry_point.py',
                                           source_dir=resource_path,
                                           role='SageMakerRole',
                                           image=docker_image_uri,
                                           sagemaker_session=sagemaker_session)

        logger.info('deploying model to endpoint: {}'.format(endpoint_name))
        predictor = tensorflow_model.deploy(initial_instance_count=1,
                                            instance_type=instance_type,
                                            accelerator_type=accelerator_type,
                                            endpoint_name=endpoint_name)

        random_input = np.random.rand(1, 1, 3, 3)

        predict_response = predictor.predict({'input': random_input.tolist()})
        assert predict_response['outputs']['probabilities']
14 changes: 7 additions & 7 deletions test/integ/conftest.py
@@ -1,14 +1,14 @@
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

import logging
@@ -37,7 +37,7 @@ def pytest_addoption(parser):
    parser.addoption('--tag', required=True)
    parser.addoption('--region', default='us-west-2')
    parser.addoption('--framework-version', required=True)
-    parser.addoption('--processor', required=True, choices=['gpu','cpu'])
+    parser.addoption('--processor', required=True, choices=['gpu', 'cpu'])


@pytest.fixture(scope='session')
1 change: 1 addition & 0 deletions test/resources/default_entry_point.py
@@ -0,0 +1 @@
# use default SageMaker defined ``input_fn``, ``predict_fn``, and ``output_fn``
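
For context, a hypothetical sketch of what overriding two of those hooks could look like (the names mirror the comment above; the signatures are illustrative assumptions, not the container's documented API):

import json

def input_fn(serialized_data, content_type):
    # Illustrative assumption: deserialize a JSON request body into model input.
    return json.loads(serialized_data)

def output_fn(prediction_result, accept):
    # Illustrative assumption: serialize the prediction result back to JSON.
    return json.dumps(prediction_result)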