Commit d795f77

Merge pull request aws#28 from awslabs/byom-examples
BYOM Notebooks - Verbosity and other notes.
2 parents 912d76c + 6374c49 commit d795f77

3 files changed: +223 -43 lines changed
README.md

Lines changed: 3 additions & 1 deletion
@@ -39,6 +39,8 @@ These examples provide more thorough mathematical treatment on a select group of
 - [Bring Your Own Model for k-means](advanced_functionality/kmeans_bring_your_own_model) shows how to take a model that's been fit elsewhere and use Amazon SageMaker Algorithms containers to host it.
 - [Bring Your Own Algorithm with R](advanced_functionality/r_bring_your_own) shows how to bring your own algorithm container to Amazon SageMaker using the R language.
 - [Bring Your Own Tensorflow Model](sagemaker-python-sdk/tensorflow_iris_byom) shows how to bring a model trained anywhere into Amazon SageMaker
+- [Bring Your Own MXNet Model](sagemaker-python-sdk/mxnet_mnist_byom) shows how to bring a model trained anywhere using MXNet into Amazon SageMaker
+- [Bring Your Own TensorFlow Model](sagemaker-python-sdk/tensorflow_iris_byom) shows how to bring a model trained anywhere using TensorFlow into Amazon SageMaker
 
 ## FAQ
 
@@ -48,4 +50,4 @@ These examples provide more thorough mathematical treatment on a select group of
 
 *How do I contribute my own example notebook?*
 
-- Although we're extremely excited to receive contributions from the community, we're still working on the best mechanism to take in examples from and external source. Please bear with us in the short-term if pull requests take longer than expected or are closed.
+- Although we're extremely excited to receive contributions from the community, we're still working on the best mechanism to take in examples from an external source. Please bear with us in the short term if pull requests take longer than expected or are closed.

sagemaker-python-sdk/mxnet_mnist_byom/mxnet_mnist.ipynb

Lines changed: 127 additions & 15 deletions
@@ -4,16 +4,79 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Mxnet MNIST BYOM. Train locally and deploy on SageMaker."
+"# MXNet BYOM: Train locally and deploy on SageMaker.\n",
+"\n",
+"1. [Introduction](#Introduction)\n",
+"2. [Prerequisites and Preprocessing](#Prerequisites-and-Preprocessing)\n",
+"    1. [Permissions and environment variables](#Permissions-and-environment-variables)\n",
+"    2. [Data Setup](#Data-setup)\n",
+"3. [Training the network locally](#Training)\n",
+"4. [Set up hosting for the model](#Set-up-hosting-for-the-model)\n",
+"    1. [Export from MXNet](#Export-the-model-from-mxnet)\n",
+"    2. [Import model into SageMaker](#Import-model-into-SageMaker)\n",
+"    3. [Create endpoint](#Create-endpoint) \n",
+"5. [Validate the endpoint for use](#Validate-the-endpoint-for-use)\n",
+"\n",
+"\n",
+"__Note__: Compare this with the [TensorFlow bring your own model example](../tensorflow_iris_byom/tensorflow_BYOM_iris.ipynb)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Introduction\n",
+"In this notebook, we will use MXNet to train a neural network locally, on the machine where this notebook runs. We will then see how to create an endpoint from the trained MXNet model and deploy it on SageMaker. Finally, we will run inference against the newly created SageMaker endpoint. \n",
+"\n",
+"The neural network that we will use is a simple fully-connected neural network. The definition of the neural network can be found in the accompanying [mnist.py](mnist.py) file. The ``build_graph`` method contains the model definition (shown below).\n",
+"\n",
+"```python\n",
+"def build_graph():\n",
+"    data = mx.sym.var('data')\n",
+"    data = mx.sym.flatten(data=data)\n",
+"    fc1 = mx.sym.FullyConnected(data=data, num_hidden=128)\n",
+"    act1 = mx.sym.Activation(data=fc1, act_type=\"relu\")\n",
+"    fc2 = mx.sym.FullyConnected(data=act1, num_hidden=64)\n",
+"    act2 = mx.sym.Activation(data=fc2, act_type=\"relu\")\n",
+"    fc3 = mx.sym.FullyConnected(data=act2, num_hidden=10)\n",
+"    return mx.sym.SoftmaxOutput(data=fc3, name='softmax')\n",
+"```\n",
+"\n",
+"From this definition we can see that there are two fully-connected hidden layers of 128 and 64 neurons, respectively. The activations of the second hidden layer are then fed into a 10-neuron fully-connected layer with a softmax output. We use 10 neurons here because the dataset on which we are going to predict is the MNIST dataset of hand-written digits, which has 10 classes. More details about the dataset can be found on the [creator's webpage](http://yann.lecun.com/exdb/mnist/)."
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Prerequisites and Preprocessing\n",
+"\n",
+"### Permissions and environment variables\n",
+"\n",
+"Here we set up the linkage and authentication to AWS services. In this notebook we only need the roles used to give learning and hosting access to your data. The SageMaker SDK will use default S3 buckets when needed. Supply the role in the variable below."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 2,
+"metadata": {
+"collapsed": true,
+"isConfigCell": true
+},
+"outputs": [],
+"source": [
+"import boto3, re\n",
+"assumed_role = boto3.client('sts').get_caller_identity()['Arn']\n",
+"role = re.sub(r'^(.+)sts::(\\d+):assumed-role/(.+?)/.*$', r'\\1iam::\\2:role/\\3', assumed_role)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In this notebook, we will train a model locally on the notebook instance and will deploy and predict from Sagemaker. This can easily be extended to a model trained anywhere else as well. All that is needed is the exported model file and the entry point file containing model definitions. \n",
+"### Data setup\n",
 "\n",
-"First, let us begin by downloading the mnist data using the mxnet utilities."
+"Next, we need to pull the data from the author's site to our local machine. Since ``mxnet`` ships with utilities for this, we will use them to download the dataset locally."
 ]
 },
 {
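The `re.sub` call in the permissions cell above rewrites the STS assumed-role ARN returned by `get_caller_identity` into the ARN of the underlying IAM role. A minimal sketch of that rewrite on its own, using the same pattern; the account ID, role name, session name, and the helper name `assumed_role_to_role_arn` are made-up placeholders, not values from the notebook:

```python
import re


def assumed_role_to_role_arn(assumed_role_arn):
    """Rewrite an STS assumed-role ARN as the ARN of the underlying IAM role."""
    return re.sub(
        r'^(.+)sts::(\d+):assumed-role/(.+?)/.*$',
        r'\1iam::\2:role/\3',
        assumed_role_arn,
    )


# Hypothetical ARN of the shape STS reports for an assumed role.
example = 'arn:aws:sts::123456789012:assumed-role/ExampleNotebookRole/SageMaker'
print(assumed_role_to_role_arn(example))
# arn:aws:iam::123456789012:role/ExampleNotebookRole
```

If the caller is not an assumed role (for example an IAM user), the pattern does not match and the string comes back unchanged, so in that situation `role` would need to be set by hand.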
@@ -34,7 +97,34 @@
 "collapsed": true
 },
 "source": [
-"Train a typical mxnet model for lenet."
+"### Training\n",
+"\n",
+"It is time to train the network. Since we are training the network locally, we can make use of the MXNet training tools. The training method is also in the accompanying [mnist.py](mnist.py) file. The method is shown below. \n",
+"\n",
+"```python \n",
+"def train(data, hyperparameters= {'learning_rate': 0.11}, num_cpus=0, num_gpus =1 , **kwargs):\n",
+"    train_labels = data['train_label']\n",
+"    train_images = data['train_data']\n",
+"    test_labels = data['test_label']\n",
+"    test_images = data['test_data']\n",
+"    batch_size = 100\n",
+"    train_iter = mx.io.NDArrayIter(train_images, train_labels, batch_size, shuffle=True)\n",
+"    val_iter = mx.io.NDArrayIter(test_images, test_labels, batch_size)\n",
+"    logging.getLogger().setLevel(logging.DEBUG)\n",
+"    mlp_model = mx.mod.Module(\n",
+"        symbol=build_graph(),\n",
+"        context=get_train_context(num_cpus, num_gpus))\n",
+"    mlp_model.fit(train_iter,\n",
+"        eval_data=val_iter,\n",
+"        optimizer='sgd',\n",
+"        optimizer_params={'learning_rate': float(hyperparameters.get(\"learning_rate\", 0.1))},\n",
+"        eval_metric='acc',\n",
+"        batch_end_callback=mx.callback.Speedometer(batch_size, 100),\n",
+"        num_epoch=10)\n",
+"    return mlp_model\n",
+"```\n",
+"\n",
+"The method above takes the ``data`` dictionary returned by the ``get_mnist`` method (a dictionary of data arrays), a ``hyperparameters`` dictionary that contains only the learning rate, and a few other parameters. It creates a [``mxnet.mod.Module``](https://mxnet.incubator.apache.org/api/python/module.html) from the network graph we built in the ``build_graph`` method and trains the network using the ``mxnet.mod.Module.fit`` method. "
 ]
 },
 {
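As a sanity check on the network that `build_graph` defines and `train` fits, the number of trainable parameters can be worked out by hand. The sketch below assumes 28x28 MNIST images flattened to 784 inputs and a bias term on every `FullyConnected` layer (the MXNet default); `dense_params` is a helper introduced here for illustration.

```python
def dense_params(n_in, n_out):
    """Parameters of one fully-connected layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


# Flattened 28x28 input, then the three FullyConnected layers from build_graph.
layer_sizes = [28 * 28, 128, 64, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [100480, 8256, 650]
print(sum(counts))  # 109386
```

At roughly 110 thousand parameters the model is small, which is consistent with training it locally for 10 epochs.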
@@ -53,7 +143,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Export the model and save it down. Analogous to the tensorflow example, some structure needs to be followed, which is explained in the following code."
+"## Set up hosting for the model\n",
+"\n",
+"### Export the model from mxnet\n",
+"\n",
+"In order to set up hosting, we have to import the model from training to hosting. We will begin by exporting the model from MXNet and saving it to disk. As in the [TensorFlow example](../tensorflow_iris_byom/tensorflow_BYOM_iris.ipynb), some structure needs to be followed: the exported model has to be converted into a form that ``sagemaker.mxnet.model.MXNetModel`` can read. The following code exports the model in that form:"
 ]
 },
 {
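The export cell itself is unchanged by this commit and so does not appear in the diff. As a rough sketch of the packaging step only, the snippet below bundles a directory of exported files into a flat `model.tar.gz` using the standard library. The placeholder file names are assumptions: `model-symbol.json` and `model-0000.params` are what `Module.save_checkpoint('model', 0)` writes, while `model-shapes.json` is assumed to be the extra shape description the hosting container looks for. `package_model` is a helper introduced here, not part of the notebook.

```python
import os
import tarfile
import tempfile


def package_model(model_dir, archive_path):
    """Write every file in model_dir to the top level of a gzipped tar archive."""
    with tarfile.open(archive_path, 'w:gz') as tar:
        for name in sorted(os.listdir(model_dir)):
            # arcname drops the directory prefix so files sit at the archive root.
            tar.add(os.path.join(model_dir, name), arcname=name)
    return archive_path


# Placeholder files stand in for a real MXNet export (assumed names, see above).
workdir = tempfile.mkdtemp()
model_dir = os.path.join(workdir, 'model')
os.mkdir(model_dir)
for name in ('model-symbol.json', 'model-0000.params', 'model-shapes.json'):
    with open(os.path.join(model_dir, name), 'w') as f:
        f.write('{}')

archive = package_model(model_dir, os.path.join(workdir, 'model.tar.gz'))
with tarfile.open(archive) as tar:
    names = tar.getnames()
print(names)
```

Keeping the files at the archive root, rather than under a `model/` prefix, is the design choice being illustrated here.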
@@ -76,7 +170,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Open a sagemaker session and upload the model on to the default S3 bucket."
+"### Import model into SageMaker\n",
+"\n",
+"Open a new SageMaker session and upload the model to the default S3 bucket. We can use the ``sagemaker.Session.upload_data`` method to do this. We need the location where we exported the model from MXNet and the location in our default bucket where we want to store the model (``/model``). The default S3 bucket can be found using the ``sagemaker.Session.default_bucket`` method."
 ]
 },
 {
@@ -97,7 +193,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Use the ``sagemaker.mxnet.model.MXNetModel`` to create a new model that can be deployed."
+"Use ``sagemaker.mxnet.model.MXNetModel`` to import the model into SageMaker so that it can be deployed. We need the S3 location of the model, the role for authentication, and the ``entry_point`` file where the model definition is stored (``mnist.py``). The import call is the following:"
 ]
 },
 {
@@ -110,15 +206,17 @@
 "source": [
 "from sagemaker.mxnet.model import MXNetModel\n",
 "sagemaker_model = MXNetModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',\n",
-"                             role = '<<set role here>>',\n",
+"                             role = role,\n",
 "                             entry_point = 'mnist.py')"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Deploy the model"
+"### Create endpoint\n",
+"\n",
+"Now the model is ready to be deployed at a SageMaker endpoint. We can use the ``sagemaker.mxnet.model.MXNetModel.deploy`` method to do this. Unless you have created or prefer other instances, we recommend using 1 ``'ml.c4.xlarge'`` instance for this endpoint. The instance count and instance type are supplied as arguments. "
 ]
 },
 {
@@ -137,7 +235,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We can now use this predictor to classify hand-written digits."
+"### Validate the endpoint for use\n",
+"\n",
+"We can now use this endpoint to classify hand-written digits."
 ]
 },
 {
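The prediction cells are not part of this diff, so the exact response format is not shown here. Assuming the endpoint returns one row of 10 class scores per input image (an assumption, as are the scores below), the predicted digit is simply the index of the largest score. A minimal sketch with made-up values and a hypothetical helper `predicted_digit`:

```python
def predicted_digit(scores):
    """Return the index of the highest score, i.e. the predicted class."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up response for a single image; a real endpoint response may be shaped differently.
response = [[0.01, 0.02, 0.05, 0.80, 0.01, 0.03, 0.02, 0.03, 0.02, 0.01]]
digits = [predicted_digit(row) for row in response]
print(digits)  # [3]
```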
@@ -174,6 +274,13 @@
 "print(predictor.endpoint)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"If you no longer need the endpoint, you can remove it. Remember that running endpoints incur charges; if this is a simple test or practice run, we recommend deleting the endpoint."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -182,9 +289,14 @@
 },
 "outputs": [],
 "source": [
-"import sagemaker\n",
-"\n",
-"sagemaker.Session().delete_endpoint(predictor.endpoint)"
+"# sagemaker.Session().delete_endpoint(predictor.endpoint)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Clear all stored model data so that we don't overwrite them the next time. "
 ]
 },
 {
@@ -204,9 +316,9 @@
 "metadata": {
 "notice": "Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.",
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "Environment (conda_mxnet_p36)",
 "language": "python",
-"name": "python3"
+"name": "conda_mxnet_p36"
 },
 "language_info": {
 "codemirror_mode": {
