|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "# Gluon CIFAR-10 Trained in Local Mode\n", |
| 8 | + "_**ResNet model in Gluon trained locally in a notebook instance**_\n", |
| 9 | + "\n", |
| 10 | + "---\n", |
| 11 | + "\n", |
| 12 | + "---\n", |
| 13 | + "\n", |
| 14 | + "_This notebook was created and tested on an ml.p3.8xlarge notebook instance._\n", |
| 15 | + "\n", |
| 16 | + "## Setup\n", |
| 17 | + "\n", |
| 18 | +    "Import the SageMaker Python SDK and retrieve the notebook instance's IAM execution role." |
| 19 | + ] |
| 20 | + }, |
| 21 | + { |
| 22 | + "cell_type": "code", |
| 23 | + "execution_count": null, |
| 24 | + "metadata": {}, |
| 25 | + "outputs": [], |
| 26 | + "source": [ |
| 27 | + "import sagemaker\n", |
| 28 | + "from sagemaker.mxnet import MXNet\n", |
| 29 | + "\n", |
| 30 | + "sagemaker_session = sagemaker.Session()\n", |
| 31 | + "role = sagemaker.get_execution_role()" |
| 32 | + ] |
| 33 | + }, |
| 34 | + { |
| 35 | + "cell_type": "markdown", |
| 36 | + "metadata": {}, |
| 37 | + "source": [ |
| 38 | +    "Install the prerequisites for local mode training by running the setup script." |
| 39 | + ] |
| 40 | + }, |
| 41 | + { |
| 42 | + "cell_type": "code", |
| 43 | + "execution_count": null, |
| 44 | + "metadata": {}, |
| 45 | + "outputs": [], |
| 46 | + "source": [ |
| 47 | + "!/bin/bash setup.sh" |
| 48 | + ] |
| 49 | + }, |
| 50 | + { |
| 51 | + "cell_type": "markdown", |
| 52 | + "metadata": {}, |
| 53 | + "source": [ |
| 54 | + "---\n", |
| 55 | + "\n", |
| 56 | + "## Data\n", |
| 57 | + "\n", |
| 58 | + "We use the helper scripts to download CIFAR-10 training data and sample images." |
| 59 | + ] |
| 60 | + }, |
| 61 | + { |
| 62 | + "cell_type": "code", |
| 63 | + "execution_count": null, |
| 64 | + "metadata": {}, |
| 65 | + "outputs": [], |
| 66 | + "source": [ |
| 67 | + "from cifar10_utils import download_training_data\n", |
| 68 | + "download_training_data()" |
| 69 | + ] |
| 70 | + }, |
| 71 | + { |
| 72 | + "cell_type": "markdown", |
| 73 | + "metadata": {}, |
| 74 | + "source": [ |
| 75 | +    "We use the `sagemaker.Session.upload_data` function to upload our dataset to an S3 location. The return value `inputs` identifies that location; we will use it later when we start the training job.\n", |
| 76 | + "\n", |
| 77 | + "Even though we are training within our notebook instance, we'll continue to use the S3 data location since it will allow us to easily transition to training in SageMaker's managed environment." |
| 78 | + ] |
| 79 | + }, |
| 80 | + { |
| 81 | + "cell_type": "code", |
| 82 | + "execution_count": null, |
| 83 | + "metadata": {}, |
| 84 | + "outputs": [], |
| 85 | + "source": [ |
| 86 | + "inputs = sagemaker_session.upload_data(path='data', key_prefix='data/DEMO-gluon-cifar10')\n", |
| 87 | + "print('input spec (in this case, just an S3 path): {}'.format(inputs))" |
| 88 | + ] |
| 89 | + }, |
| 90 | + { |
| 91 | + "cell_type": "markdown", |
| 92 | + "metadata": {}, |
| 93 | + "source": [ |
| 94 | + "---\n", |
| 95 | + "\n", |
| 96 | + "## Script\n", |
| 97 | + "\n", |
| 98 | + "We need to provide a training script that can run on the SageMaker platform. When SageMaker calls your function, it will pass in arguments that describe the training environment. Check the script below to see how this works.\n", |
| 99 | + "\n", |
| 100 | + "The network itself is a pre-built version contained in the [Gluon Model Zoo](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/model_zoo.html)." |
| 101 | + ] |
| 102 | + }, |
| 103 | + { |
| 104 | + "cell_type": "code", |
| 105 | + "execution_count": null, |
| 106 | + "metadata": {}, |
| 107 | + "outputs": [], |
| 108 | + "source": [ |
| 109 | +    "!cat cifar10.py" |
| 110 | + ] |
| 111 | + }, |
| 112 | + { |
| 113 | + "cell_type": "markdown", |
| 114 | + "metadata": {}, |
| 115 | + "source": [ |
| 116 | + "---\n", |
| 117 | + "\n", |
| 118 | + "## Train (Local Mode)\n", |
| 119 | + "\n", |
| 120 | +    "The `MXNet` estimator will create our training job. To switch from training in SageMaker's managed environment to training within a notebook instance, just set `train_instance_type` to `local_gpu`." |
| 121 | + ] |
| 122 | + }, |
| 123 | + { |
| 124 | + "cell_type": "code", |
| 125 | + "execution_count": null, |
| 126 | + "metadata": {}, |
| 127 | + "outputs": [], |
| 128 | + "source": [ |
| 129 | + "m = MXNet('cifar10.py', \n", |
| 130 | + " role=role, \n", |
| 131 | + " train_instance_count=1, \n", |
| 132 | + " train_instance_type='local_gpu',\n", |
| 133 | + " hyperparameters={'batch_size': 1024, \n", |
| 134 | + " 'epochs': 50, \n", |
| 135 | + " 'learning_rate': 0.1, \n", |
| 136 | + " 'momentum': 0.9})" |
| 137 | + ] |
| 138 | + }, |
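The markdown above notes that switching between local and managed training is just a matter of the instance type. As a reference sketch (not part of this notebook's flow), the same estimator pointed at SageMaker's managed environment could look like the following; `ml.p3.2xlarge` is only an example instance type, and `role` is the execution role from the Setup cell:

```python
from sagemaker.mxnet import MXNet

# Identical configuration to the local-mode estimator above; only
# train_instance_type changes, from 'local_gpu' to a managed GPU
# instance type ('ml.p3.2xlarge' is an illustrative choice).
m_managed = MXNet('cifar10.py',
                  role=role,
                  train_instance_count=1,
                  train_instance_type='ml.p3.2xlarge',
                  hyperparameters={'batch_size': 1024,
                                   'epochs': 50,
                                   'learning_rate': 0.1,
                                   'momentum': 0.9})
# m_managed.fit(inputs)  # would launch a managed training job on AWS
```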
| 139 | + { |
| 140 | + "cell_type": "markdown", |
| 141 | + "metadata": {}, |
| 142 | + "source": [ |
| 143 | + "After we've constructed our `MXNet` object, we can fit it using the data we uploaded to S3. SageMaker makes sure our data is available in the local filesystem, so our training script can simply read the data from disk." |
| 144 | + ] |
| 145 | + }, |
| 146 | + { |
| 147 | + "cell_type": "code", |
| 148 | + "execution_count": null, |
| 149 | + "metadata": { |
| 150 | + "scrolled": true |
| 151 | + }, |
| 152 | + "outputs": [], |
| 153 | + "source": [ |
| 154 | + "m.fit(inputs)" |
| 155 | + ] |
| 156 | + }, |
| 157 | + { |
| 158 | + "cell_type": "markdown", |
| 159 | + "metadata": {}, |
| 160 | + "source": [ |
| 161 | + "---\n", |
| 162 | + "\n", |
| 163 | + "## Host\n", |
| 164 | + "\n", |
| 165 | + "After training, we use the MXNet estimator object to deploy an endpoint. Because we trained locally, we'll also deploy the endpoint locally. The predictor object returned by `deploy` lets us call the endpoint and perform inference on our sample images." |
| 166 | + ] |
| 167 | + }, |
| 168 | + { |
| 169 | + "cell_type": "code", |
| 170 | + "execution_count": null, |
| 171 | + "metadata": {}, |
| 172 | + "outputs": [], |
| 173 | + "source": [ |
| 174 | + "predictor = m.deploy(initial_instance_count=1, instance_type='local_gpu')" |
| 175 | + ] |
| 176 | + }, |
| 177 | + { |
| 178 | + "cell_type": "markdown", |
| 179 | + "metadata": {}, |
| 180 | + "source": [ |
| 181 | + "### Evaluate\n", |
| 182 | + "\n", |
| 183 | + "We'll use these CIFAR-10 sample images to test the service:\n", |
| 184 | + "\n", |
| 185 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/airplane1.png\" />\n", |
| 186 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/automobile1.png\" />\n", |
| 187 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/bird1.png\" />\n", |
| 188 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/cat1.png\" />\n", |
| 189 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/deer1.png\" />\n", |
| 190 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/dog1.png\" />\n", |
| 191 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/frog1.png\" />\n", |
| 192 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/horse1.png\" />\n", |
| 193 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/ship1.png\" />\n", |
| 194 | + "<img style=\"display: inline; height: 32px; margin: 0.25em\" src=\"images/truck1.png\" />\n", |
| 195 | + "\n" |
| 196 | + ] |
| 197 | + }, |
| 198 | + { |
| 199 | + "cell_type": "code", |
| 200 | + "execution_count": null, |
| 201 | + "metadata": {}, |
| 202 | + "outputs": [], |
| 203 | + "source": [ |
| 204 | +    "# load the CIFAR-10 sample images and convert them into the format the prediction endpoint expects\n", |
| 205 | + "from cifar10_utils import read_images\n", |
| 206 | + "\n", |
| 207 | + "filenames = ['images/airplane1.png',\n", |
| 208 | + " 'images/automobile1.png',\n", |
| 209 | + " 'images/bird1.png',\n", |
| 210 | + " 'images/cat1.png',\n", |
| 211 | + " 'images/deer1.png',\n", |
| 212 | + " 'images/dog1.png',\n", |
| 213 | + " 'images/frog1.png',\n", |
| 214 | + " 'images/horse1.png',\n", |
| 215 | + " 'images/ship1.png',\n", |
| 216 | + " 'images/truck1.png']\n", |
| 217 | + "\n", |
| 218 | + "image_data = read_images(filenames)" |
| 219 | + ] |
| 220 | + }, |
| 221 | + { |
| 222 | + "cell_type": "markdown", |
| 223 | + "metadata": {}, |
| 224 | + "source": [ |
| 225 | + "The predictor runs inference on our input data and returns the predicted class label (as a float value, so we convert to int for display)." |
| 226 | + ] |
| 227 | + }, |
| 228 | + { |
| 229 | + "cell_type": "code", |
| 230 | + "execution_count": null, |
| 231 | + "metadata": { |
| 232 | + "scrolled": true |
| 233 | + }, |
| 234 | + "outputs": [], |
| 235 | + "source": [ |
| 236 | + "for i, img in enumerate(image_data):\n", |
| 237 | + " response = predictor.predict(img)\n", |
| 238 | + " print('image {}: class: {}'.format(i, int(response)))" |
| 239 | + ] |
| 240 | + }, |
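The integer printed above is a CIFAR-10 class index. A small helper like the following (a convenience sketch, using the standard CIFAR-10 label ordering, which matches the order of the sample images above) turns the endpoint's float response into a readable name:

```python
# Standard CIFAR-10 class ordering: index 0 is 'airplane', index 9 is 'truck'.
CIFAR10_LABELS = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                  'dog', 'frog', 'horse', 'ship', 'truck']

def label_name(class_index):
    """Map a predicted class index (the endpoint returns a float) to its label."""
    return CIFAR10_LABELS[int(class_index)]
```

For example, `label_name(response)` inside the loop above would print `airplane` for a response of `0.0`.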
| 241 | + { |
| 242 | + "cell_type": "markdown", |
| 243 | + "metadata": {}, |
| 244 | + "source": [ |
| 245 | + "---\n", |
| 246 | + "\n", |
| 247 | + "## Cleanup\n", |
| 248 | + "\n", |
| 249 | + "After you have finished with this example, remember to delete the prediction endpoint. Only one local endpoint can be running at a time." |
| 250 | + ] |
| 251 | + }, |
| 252 | + { |
| 253 | + "cell_type": "code", |
| 254 | + "execution_count": null, |
| 255 | + "metadata": {}, |
| 256 | + "outputs": [], |
| 257 | + "source": [ |
| 258 | + "m.delete_endpoint()" |
| 259 | + ] |
| 260 | + } |
| 261 | + ], |
| 262 | + "metadata": { |
| 263 | + "kernelspec": { |
| 264 | + "display_name": "conda_mxnet_p27", |
| 265 | + "language": "python", |
| 266 | + "name": "conda_mxnet_p27" |
| 267 | + }, |
| 268 | + "language_info": { |
| 269 | + "codemirror_mode": { |
| 270 | + "name": "ipython", |
| 271 | + "version": 2 |
| 272 | + }, |
| 273 | + "file_extension": ".py", |
| 274 | + "mimetype": "text/x-python", |
| 275 | + "name": "python", |
| 276 | + "nbconvert_exporter": "python", |
| 277 | + "pygments_lexer": "ipython2", |
| 278 | + "version": "2.7.14" |
| 279 | + }, |
| 280 | + "notice": "Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License." |
| 281 | + }, |
| 282 | + "nbformat": 4, |
| 283 | + "nbformat_minor": 2 |
| 284 | +} |