Updated: local mode notebooks based on comments from first PR #232

Merged: 1 commit, Apr 6, 2018
@@ -8,9 +8,9 @@
"\n",
"### Pre-requisites\n",
"\n",
"This notebook shows how to use the SageMaker Python SDK to run your code in a local container before deploying to SageMaker's managed training or hosting environments. This can speed up iterative testing and debugging while using the same familiar Python SDK interface. Just change your estimator's `train_instance_type` to `local` (or `local_gpu` if you're using an ml.p2 or ml.p3 notebook instance).\n",
"This notebook shows how to use the SageMaker Python SDK to run your code in a local container before deploying to SageMaker's managed training or hosting environments. This can speed up iterative testing and debugging while using the same familiar Python SDK interface. Just change your estimator's `train_instance_type` to `local`. You could also use `local_gpu` if you're using an ml.p2 or ml.p3 notebook instance, but then you'll need to set `train_instance_count=1`, since distributed local GPU training is not yet supported.\n",
"\n",
"In order to use this feature you'll need to install docker-compose (and nvidia-docker if training with a GPU).\n",
"In order to use this feature you'll need to install docker-compose (and nvidia-docker if training with a GPU). Running the setup.sh script below will handle this for you.\n",
"\n",
"**Note: you can only run a single local notebook at a time.**"
]
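The `local` / `local_gpu` switch described above can be sketched as follows; the `nvidia-smi` probe mirrors the detection cell used elsewhere in this PR (a minimal sketch, not the notebook's exact cell):

```python
import subprocess

# Pick the local-mode instance type: use the GPU container only when
# nvidia-smi reports a device (i.e. on an ml.p2/ml.p3 notebook instance).
instance_type = "local"
try:
    if subprocess.call("nvidia-smi") == 0:
        instance_type = "local_gpu"
except OSError:
    pass  # nvidia-smi not installed: fall back to CPU local mode
print("Instance type = " + instance_type)
```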
@@ -30,7 +30,7 @@
"source": [
"### Overview\n",
"\n",
"MNIST is a widely used dataset for handwritten digit classification. It consists of 70,000 labeled 28x28 pixel grayscale images of hand-written digits. The dataset is split into 60,000 training images and 10,000 test images. There are 10 classes (one for each of the 10 digits). This tutorial will show how to train and test an MNIST model on SageMaker using MXNet and the Gluon API."
"MNIST is a widely used dataset for handwritten digit classification. It consists of 70,000 labeled 28x28 pixel grayscale images of hand-written digits. The dataset is split into 60,000 training images and 10,000 test images. There are 10 classes (one for each of the 10 digits). This tutorial will show how to train and test an MNIST model in SageMaker local mode using MXNet and the Gluon API."
]
},
{
@@ -121,7 +121,7 @@
"source": [
"## Run the training script on SageMaker\n",
"\n",
"The ```MXNet``` class allows us to run our training function on SageMaker. We need to configure it with our training script, an IAM role, the number of training instances, and the training instance type. This is the only difference from [mnist_with_gluon.ipynb](./mnist_with_gluon.ipynb). Instead of ``train_instance_type='ml.c4.xlarge'``, we set it to ``train_instance_type='local'``. For local training with GPU, we could set this to \"local_gpu\". In this case, `instance_type` was set above based on whether you're running a GPU instance."
"The ```MXNet``` class allows us to run our training function in SageMaker local mode. We need to configure it with our training script, an IAM role, the number of training instances, and the training instance type. This is the only difference from [mnist_with_gluon.ipynb](./mnist_with_gluon.ipynb). Instead of ``train_instance_type='ml.c4.xlarge'``, we set it to ``train_instance_type='local'``. For local training with GPU, we could set this to \"local_gpu\". In this case, `instance_type` was set above based on whether you're running a GPU instance."
]
},
{
@@ -135,7 +135,7 @@
" train_instance_count=1, \n",
" train_instance_type=instance_type,\n",
" hyperparameters={'batch_size': 100, \n",
" 'epochs': 2, \n",
" 'epochs': 20, \n",
" 'learning_rate': 0.1, \n",
" 'momentum': 0.9, \n",
" 'log_interval': 100})"
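Pieced together, the fragment above corresponds to an estimator call along these lines (a sketch assuming the v1 SageMaker Python SDK; `mnist.py`, `role`, and the training inputs come from cells not shown in this diff):

```python
# Hyperparameters forwarded to the Gluon training script; this PR
# raises 'epochs' from 2 to 20.
hyperparameters = {
    "batch_size": 100,
    "epochs": 20,
    "learning_rate": 0.1,
    "momentum": 0.9,
    "log_interval": 100,
}
# With the SageMaker Python SDK installed, these would be passed as:
# m = MXNet("mnist.py", role=role,
#           train_instance_count=1,
#           train_instance_type=instance_type,  # 'local' or 'local_gpu'
#           hyperparameters=hyperparameters)
# m.fit(inputs)
print(sorted(hyperparameters))
```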
@@ -8,9 +8,9 @@
"\n",
"## Pre-requisites\n",
"\n",
"This notebook shows how to use the SageMaker Python SDK to run your code in a local container before deploying to SageMaker's managed training or hosting environments. This can speed up iterative testing and debugging while using the same familiar Python SDK interface. Just change your estimator's `train_instance_type` to `local` (or `local_gpu` if you're using an ml.p2 or ml.p3 notebook instance).\n",
"This notebook shows how to use the SageMaker Python SDK to run your code in a local container before deploying to SageMaker's managed training or hosting environments. This can speed up iterative testing and debugging while using the same familiar Python SDK interface. Just change your estimator's `train_instance_type` to `local`. You could also use `local_gpu` if you're using an ml.p2 or ml.p3 notebook instance, but then you'll need to set `train_instance_count=1`, since distributed local GPU training is not yet supported.\n",
"\n",
"In order to use this feature you'll need to install docker-compose (and nvidia-docker if training with a GPU).\n",
"In order to use this feature you'll need to install docker-compose (and nvidia-docker if training with a GPU). Running the setup.sh script below will handle this for you.\n",
"\n",
"**Note: you can only run a single local notebook at a time.**"
]
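The dependency checks that a setup script like the one mentioned above performs can be sketched as follows (the repository's actual setup.sh may do more than this):

```python
import shutil

# Local mode needs docker and docker-compose; nvidia-docker is only
# needed when a GPU is present and you train with 'local_gpu'.
required = ["docker", "docker-compose"]
if shutil.which("nvidia-smi"):  # GPU detected
    required.append("nvidia-docker")
missing = [tool for tool in required if shutil.which(tool) is None]
print("missing:", missing if missing else "nothing")
```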
@@ -30,9 +30,9 @@
"source": [
"## Overview\n",
"\n",
"The **SageMaker Python SDK** helps you deploy your models for training and hosting in optimized, production-ready containers in SageMaker. The SageMaker Python SDK is easy to use, modular, extensible and compatible with TensorFlow and MXNet. This tutorial focuses on how to create a convolutional neural network model to train the [MNIST dataset](http://yann.lecun.com/exdb/mnist/) using **TensorFlow in local mode**.\n",
"The **SageMaker Python SDK** helps you deploy your models for training and hosting in optimized, production-ready containers in SageMaker local mode. The SageMaker Python SDK is easy to use, modular, extensible and compatible with TensorFlow and MXNet. This tutorial focuses on how to create a convolutional neural network model to train the [MNIST dataset](http://yann.lecun.com/exdb/mnist/) using **TensorFlow in local mode**.\n",
"\n",
"### Set up the environment Set up the environment"
"### Set up the environment"
]
},
{
@@ -48,14 +48,6 @@
"\n",
"sagemaker_session = sagemaker.Session()\n",
"\n",
"instance_type = 'local'\n",
"\n",
"if subprocess.call('nvidia-smi') == 0:\n",
" ## Set type to GPU if one is present\n",
" instance_type = 'local_gpu'\n",
" \n",
"print(\"Instance type = \" + instance_type)\n",
"\n",
"role = get_execution_role()"
]
},
@@ -156,7 +148,7 @@
"source": [
"## Create a training job using the sagemaker.TensorFlow estimator\n",
"\n",
"The `TensorFlow` class allows us to run our training function on SageMaker. We need to configure it with our training script, an IAM role, the number of training instances, and the training instance type. The only difference from [tensorflow_distributed_mnist.ipynb](./tensorflow_distributed_mnist.ipynb) is that instead of ``train_instance_type='ml.c4.xlarge'``, we set it to ``train_instance_type='local'``. For local training with GPU, we could set this to \"local_gpu\". In this case, `instance_type` was set above based on whether you're running a GPU instance.\n",
"The `TensorFlow` class allows us to run our training function on SageMaker. We need to configure it with our training script, an IAM role, the number of training instances, and the training instance type. The only difference from [tensorflow_distributed_mnist.ipynb](./tensorflow_distributed_mnist.ipynb) is that instead of ``train_instance_type='ml.c4.xlarge'``, we set it to ``train_instance_type='local'``. For local training with GPU, we could set this to `'local_gpu'`, but would then need to set `train_instance_count=1`.\n",
"\n",
"After we've constructed our `TensorFlow` object, we fit it using the data we uploaded to S3. Even though we're in local mode, using S3 as our data source makes sense because it maintains consistency with how SageMaker's distributed, managed training ingests data."
]
@@ -175,8 +167,8 @@
" role=role,\n",
" training_steps=10, \n",
" evaluation_steps=10,\n",
" train_instance_count=1,\n",
" train_instance_type=instance_type)\n",
" train_instance_count=2,\n",
" train_instance_type='local')\n",
"\n",
"mnist_estimator.fit(inputs)"
]
@@ -208,7 +200,7 @@
"outputs": [],
"source": [
"mnist_predictor = mnist_estimator.deploy(initial_instance_count=1,\n",
" instance_type=instance_type)"
" instance_type='local')"
]
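Once `deploy()` returns, the local predictor is used exactly like one backed by a hosted endpoint. A hedged usage sketch (the flattened 28x28 input shape is an assumption based on the MNIST data this notebook uses):

```python
import numpy as np

# One fake flattened MNIST image; shape is an assumption for illustration.
sample = np.zeros((1, 784), dtype=np.float32)
# With a live local-mode predictor, inference and cleanup would look like:
# prediction = mnist_predictor.predict(sample)
# mnist_predictor.delete_endpoint()  # stops the local serving container
print(sample.shape)
```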
},
{