Commit ec2c3a4

Merge pull request aws#62 from awslabs/scikit_bring_your_own
Scikit bring your own: Merging this in preparation for Monday's meeting.
2 parents d795f77 + c1968c4 commit ec2c3a4

File tree: 20 files changed, +1567 −0 lines changed
Dockerfile
Lines changed: 43 additions & 0 deletions

# Build an image that can do training and inference in SageMaker
# This is a Python 2 image that uses the nginx, gunicorn, flask stack
# for serving inferences in a stable way.

FROM ubuntu:16.04

MAINTAINER Amazon AI <[email protected]>

RUN apt-get -y update && apt-get install -y --no-install-recommends \
         wget \
         python \
         nginx \
         ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Here we get all python packages.
# There's substantial overlap between scipy and numpy that we eliminate by
# linking them together. Likewise, pip leaves the install caches populated which uses
# a significant amount of space. These optimizations save a fair amount of space in the
# image, which reduces start up time.
RUN wget https://bootstrap.pypa.io/get-pip.py && python get-pip.py && \
    pip install numpy scipy scikit-learn pandas flask gevent gunicorn && \
        (cd /usr/local/lib/python2.7/dist-packages/scipy/.libs; rm *; ln ../../numpy/.libs/* .) && \
        rm -rf /root/.cache

# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering our standard
# output stream, which means that logs can be delivered to the user quickly. PYTHONDONTWRITEBYTECODE
# keeps Python from writing the .pyc files which are unnecessary in this case. We also update
# PATH so that the train and serve programs are found when the container is invoked.

ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"

# Make nginx log to stdout/err so that the log messages will be picked up by the
# Docker logger
RUN ln -s /dev/stdout /tmp/nginx.access.log && ln -s /dev/stderr /tmp/nginx.error.log

# Set up the program in the image
COPY decision_trees /opt/program
WORKDIR /opt/program
README.md
Lines changed: 72 additions & 0 deletions
# Bring-your-own Algorithm Sample

This example shows how to package an algorithm for use with IM. We have chosen a simple [scikit-learn][skl] implementation of decision trees to illustrate the procedure.

IM supports two execution modes: _training_, where the algorithm uses input data to train a new model, and _serving_, where the algorithm accepts HTTP requests and uses the previously trained model to do an inference (also called "scoring", "prediction", or "transformation").

The algorithm that we have built here supports both training and scoring in IM with the same container image. It is perfectly reasonable to build an algorithm that supports only training _or_ scoring, as well as to build an algorithm that has separate container images for training and scoring.

In order to build a production grade inference server into the container, we use the following stack to make the implementer's job simple:

1. __[nginx][nginx]__ is a light-weight layer that handles the incoming HTTP requests and manages the I/O in and out of the container efficiently.
2. __[gunicorn][gunicorn]__ is a WSGI pre-forking worker server that runs multiple copies of your application and load balances between them.
3. __[flask][flask]__ is a simple web framework used in the inference app that you write. It lets you respond to calls on the `/ping` and `/invocations` endpoints without having to write much code. A client-side sketch of calling these endpoints follows this list.
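To see the stack from the client side, here is a minimal sketch that exercises both endpoints with the `requests` library, assuming the container is already serving locally with port 8080 published (for example via the serve-local.sh script described below). The feature rows in the payload are made up for illustration.

```python
# Minimal client-side check of the two endpoints. Assumes the container is
# serving on localhost:8080; the feature values below are hypothetical.
import requests

base = 'http://localhost:8080'

# /ping returns 200 when the server is healthy (i.e. the model loads).
print(requests.get(base + '/ping').status_code)

# /invocations takes CSV rows and returns one prediction per line.
payload = '4.6,3.1,1.5,0.2\n5.9,3.0,5.1,1.8\n'
resp = requests.post(base + '/invocations', data=payload,
                     headers={'Content-Type': 'text/csv'})
print(resp.text)
```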
## The Structure of the Sample Code

The components are as follows:

* __Dockerfile__: The _Dockerfile_ describes how the image is built and what it contains. It is a recipe for your container and gives you tremendous flexibility to construct almost any execution environment you can imagine. Here, we use the Dockerfile to describe a pretty standard Python science stack and the simple scripts that we're going to add to it. See the [Dockerfile reference][dockerfile] for what's possible here.

* __build\_and\_push.sh__: The script to build the Docker image (using the Dockerfile above) and push it to the [Amazon EC2 Container Registry (ECR)][ecr] so that it can be deployed to IM. Specify the name of the image as the argument to this script. The script will generate a full name for the repository in your account and your configured AWS region. If this ECR repository doesn't exist, the script will create it.

* __im-decision-trees__: The directory that contains the application to run in the container. See the next section for details about each of the files.

* __local-test__: A directory containing scripts and a setup for running simple training and inference jobs locally so that you can test that everything is set up correctly. See below for details.

### The application run inside the container

When IM starts a container, it will invoke the container with an argument of either __train__ or __serve__. We have set this container up so that the argument is treated as the command that the container executes. When training, it will run the __train__ program included and, when serving, it will run the __serve__ program.

* __train__: The main program for training the model. When you build your own algorithm, you'll edit this to include your training code.
* __serve__: The wrapper that starts the inference server. In most cases, you can use this file as-is.
* __wsgi.py__: The start up shell for the individual server workers. This only needs to be changed if you change where predictor.py is located or what it is named. A sketch follows this list.
* __predictor.py__: The algorithm-specific inference server. This is the file that you modify with your own algorithm's code.
* __nginx.conf__: The configuration for the nginx master server that manages the multiple workers.
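wsgi.py is not reproduced in this diff, but its role is narrow: the serve program launches gunicorn with the target `wsgi:app`, and predictor.py exposes a flask object named `app`, so a sketch of it can be as small as the following.

```python
# A sketch of wsgi.py: gunicorn is pointed at "wsgi:app" by serve, so this
# module only needs to re-export the flask app object from predictor.py.
import predictor as myapp

app = myapp.app
```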
### Setup for local testing

The subdirectory local-test contains scripts and sample data for testing the built container image on the local machine. When building your own algorithm, you'll want to modify it appropriately.

* __train-local.sh__: Instantiate the container configured for training.
* __serve-local.sh__: Instantiate the container configured for serving.
* __predict.sh__: Run predictions against a locally instantiated server.
* __test-dir__: The directory that gets mounted into the container with test data mounted in all the places that match the container schema.
* __payload.csv__: Sample data used by predict.sh for testing the server.

#### The directory tree mounted into the container

The tree under test-dir is mounted into the container and mimics the directory structure that IM would create for the running container during training or hosting; a sketch of a train program that uses this layout follows the list.

* __input/config/hyperparameters.json__: The hyperparameters for the training job.
* __input/data/training/leaf_train.csv__: The training data.
* __model__: The directory where the algorithm writes the model file.
* __output__: The directory where the algorithm can write its success or failure file.
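The train program itself does not appear in this diff, so here is a hedged sketch of one that follows the directory layout above. Only the /opt/ml paths and the decision-tree-model.pkl file name (which predictor.py loads) come from this sample; the max_leaf_nodes hyperparameter, the label-in-first-column convention, and the failure-file details are illustrative assumptions.

```python
#!/usr/bin/env python
# A sketch of a train program for this layout; see the caveats above.
from __future__ import print_function

import json
import os
import pickle
import sys
import traceback

import pandas as pd
from sklearn import tree

prefix = '/opt/ml/'
param_path = os.path.join(prefix, 'input/config/hyperparameters.json')
training_path = os.path.join(prefix, 'input/data/training')
model_path = os.path.join(prefix, 'model')
output_path = os.path.join(prefix, 'output')

def train():
    try:
        # Hyperparameters are passed as strings, so cast before use.
        with open(param_path, 'r') as f:
            params = json.load(f)
        max_leaf_nodes = int(params.get('max_leaf_nodes', 5))  # hypothetical hyperparameter

        # Read every CSV in the training channel; assume the label is the first column.
        files = [os.path.join(training_path, f) for f in os.listdir(training_path)]
        data = pd.concat([pd.read_csv(f, header=None) for f in files])
        labels = data.iloc[:, 0]
        features = data.iloc[:, 1:]

        clf = tree.DecisionTreeClassifier(max_leaf_nodes=max_leaf_nodes)
        clf.fit(features, labels)

        # Save the model where predictor.py expects to find it.
        with open(os.path.join(model_path, 'decision-tree-model.pkl'), 'wb') as out:
            pickle.dump(clf, out)
        print('Training complete.')
    except Exception as e:
        # Write a failure file so the job is marked as failed.
        with open(os.path.join(output_path, 'failure'), 'w') as f:
            f.write(str(e) + '\n' + traceback.format_exc())
        sys.exit(255)

if __name__ == '__main__':
    train()
    sys.exit(0)
```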
## Environment variables

When you create an inference server, you can control some of Gunicorn's options via environment variables. These can be supplied as part of the CreateModel API call (a sketch follows the table).

Parameter         | Environment Variable | Default Value
----------------- | -------------------- | -------------
number of workers | MODEL_SERVER_WORKERS | the number of CPU cores
timeout           | MODEL_SERVER_TIMEOUT | 60 seconds
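As a concrete illustration, the variables go in the Environment map of the container definition. The sketch below uses today's boto3 SageMaker client (the service this README calls IM); the model name, role ARN, and image URI are placeholders.

```python
# Hedged sketch: overriding the Gunicorn defaults at CreateModel time.
import boto3

sm = boto3.client('sagemaker')

sm.create_model(
    ModelName='decision-trees-sample',                           # placeholder
    ExecutionRoleArn='arn:aws:iam::123456789012:role/DemoRole',  # placeholder
    PrimaryContainer={
        'Image': '123456789012.dkr.ecr.us-west-2.amazonaws.com/decision-trees:latest',  # placeholder
        'Environment': {
            'MODEL_SERVER_WORKERS': '4',    # instead of one worker per CPU core
            'MODEL_SERVER_TIMEOUT': '120',  # instead of the 60-second default
        },
    },
)
```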
[skl]: http://scikit-learn.org "scikit-learn Home Page"
[dockerfile]: https://docs.docker.com/engine/reference/builder/ "The official Dockerfile reference guide"
[ecr]: https://aws.amazon.com/ecr/ "ECR Home Page"
[nginx]: http://nginx.org/
[gunicorn]: http://gunicorn.org/
[flask]: http://flask.pocoo.org/
build_and_push.sh
Lines changed: 49 additions & 0 deletions

#!/usr/bin/env bash

# This script shows how to build the Docker image and push it to ECR to be ready for use
# by IM.

# The argument to this script is the image name. This will be used as the image on the local
# machine and combined with the account and region to form the repository name for ECR.
image=$1

if [ "$image" == "" ]
then
    echo "Usage: $0 <image-name>"
    exit 1
fi

# Get the account number associated with the current IAM credentials
account=$(aws sts get-caller-identity --query Account --output text)

if [ $? -ne 0 ]
then
    exit 255
fi

# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)
region=${region:-us-west-2}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${image}:latest"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${image}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${image}" > /dev/null
fi

# Get the login command from ECR and execute it directly
$(aws ecr get-login --region ${region} --no-include-email)

# Build the docker image locally with the image name and then push it to ECR
# with the full name.
docker build -t ${image} .
docker tag ${image} ${fullname}

docker push ${fullname}
nginx.conf
Lines changed: 38 additions & 0 deletions

worker_processes 1;
daemon off; # Prevent forking

pid /tmp/nginx.pid;
error_log /tmp/nginx.error.log;

events {
  # defaults
}

http {
  include /etc/nginx/mime.types;
  default_type application/octet-stream;
  access_log /tmp/nginx.access.log combined;

  upstream gunicorn {
    server unix:/tmp/gunicorn.sock;
  }

  server {
    listen 8080 deferred;
    client_max_body_size 5m;

    keepalive_timeout 5;

    location ~ ^/(ping|invocations) {
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_redirect off;
      proxy_pass http://gunicorn;
    }

    location / {
      return 404 "{}";
    }
  }
}
predictor.py
Lines changed: 83 additions & 0 deletions

# This is the file that implements a flask server to do inferences. It's the file that you will modify to
# implement the scoring for your own algorithm.

from __future__ import print_function

import os
import json
import pickle
import StringIO
import sys
import signal
import traceback

import flask

import pandas as pd

prefix = '/opt/ml/'
model_path = os.path.join(prefix, 'model')

# A singleton for holding the model. This simply loads the model and holds it.
# It has a predict function that does a prediction based on the model and the input data.

class ScoringService(object):
    model = None  # Where we keep the model when it's loaded

    @classmethod
    def get_model(cls):
        """Get the model object for this instance, loading it if it's not already loaded."""
        if cls.model is None:
            with open(os.path.join(model_path, 'decision-tree-model.pkl'), 'rb') as inp:
                cls.model = pickle.load(inp)
        return cls.model

    @classmethod
    def predict(cls, input):
        """For the input, do the predictions and return them.

        Args:
            input (a pandas dataframe): The data on which to do the predictions. There will be
                one prediction per row in the dataframe"""
        clf = cls.get_model()
        return clf.predict(input)

# The flask app for serving predictions
app = flask.Flask(__name__)

@app.route('/ping', methods=['GET'])
def ping():
    """Determine if the container is working and healthy. In this sample container, we declare
    it healthy if we can load the model successfully."""
    health = ScoringService.get_model() is not None  # You can insert a health check here

    status = 200 if health else 404
    return flask.Response(response='\n', status=status, mimetype='application/json')

@app.route('/invocations', methods=['POST'])
def transformation():
    """Do an inference on a single batch of data. In this sample server, we take data as CSV, convert
    it to a pandas data frame for internal use, and then convert the predictions back to CSV (which really
    just means one prediction per line, since there's a single column).
    """
    data = None

    # Convert from CSV to pandas
    if flask.request.content_type == 'text/csv':
        data = flask.request.data.decode('utf-8')
        s = StringIO.StringIO(data)
        data = pd.read_csv(s, header=None)
    else:
        return flask.Response(response='This predictor only supports CSV data', status=415, mimetype='text/plain')

    print('Invoked with {} records'.format(data.shape[0]))

    # Do the prediction
    predictions = ScoringService.predict(data)

    # Convert from numpy back to CSV
    out = StringIO.StringIO()
    pd.DataFrame({'results': predictions}).to_csv(out, header=False, index=False)
    result = out.getvalue()

    return flask.Response(response=result, status=200, mimetype='text/csv')
serve
Lines changed: 67 additions & 0 deletions

#!/usr/bin/env python

# This file implements the scoring service shell. You don't necessarily need to modify it for various
# algorithms. It starts nginx and gunicorn with the correct configurations and then simply waits until
# gunicorn exits.
#
# The flask server is specified to be the app object in wsgi.py
#
# We set the following parameters:
#
# Parameter            Environment Variable    Default Value
# ---------            --------------------    -------------
# number of workers    MODEL_SERVER_WORKERS    the number of CPU cores
# timeout              MODEL_SERVER_TIMEOUT    60 seconds

from __future__ import print_function
import multiprocessing
import os
import signal
import subprocess
import sys

cpu_count = multiprocessing.cpu_count()

model_server_timeout = os.environ.get('MODEL_SERVER_TIMEOUT', 60)
model_server_workers = int(os.environ.get('MODEL_SERVER_WORKERS', cpu_count))

def sigterm_handler(nginx_pid, gunicorn_pid):
    try:
        os.kill(nginx_pid, signal.SIGQUIT)
    except OSError:
        pass
    try:
        os.kill(gunicorn_pid, signal.SIGTERM)
    except OSError:
        pass

    sys.exit(0)

def start_server():
    print('Starting the inference server with {} workers.'.format(model_server_workers))

    nginx = subprocess.Popen(['nginx', '-c', '/opt/program/nginx.conf'])
    gunicorn = subprocess.Popen(['gunicorn',
                                 '--timeout', str(model_server_timeout),
                                 '-k', 'gevent',
                                 '-b', 'unix:/tmp/gunicorn.sock',
                                 '-w', str(model_server_workers),
                                 'wsgi:app'])

    signal.signal(signal.SIGTERM, lambda a, b: sigterm_handler(nginx.pid, gunicorn.pid))

    # If either subprocess exits, so do we.
    pids = set([nginx.pid, gunicorn.pid])
    while True:
        pid, _ = os.wait()
        if pid in pids:
            break

    sigterm_handler(nginx.pid, gunicorn.pid)
    print('Inference server exiting')

# The main routine just invokes the start function.

if __name__ == '__main__':
    start_server()
