"This notebook shows how to use TensorBoard, and how the training job writes checkpoints to an external bucket.\n",
"The model used for this notebook is a ResNet model, trained with the CIFAR-10 dataset.\n",
"See the following papers for more background:\n",
"\n",
"[Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf) by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Dec 2015.\n",
"\n",
"[Identity Mappings in Deep Residual Networks](https://arxiv.org/pdf/1603.05027.pdf) by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Jul 2016."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the environment"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import os\n",
"import sagemaker\n",
"import tensorflow\n",
"from sagemaker import get_execution_role\n",
"\n",
"sagemaker_session = sagemaker.Session()\n",
"\n",
"role = get_execution_role()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download the CIFAR-10 dataset\n",
"Downloading the test and training data will take around 5 minutes."
"**sagemaker_session.upload_data** will upload the CIFAR-10 dataset from your machine to a bucket named **sagemaker-{*your aws account number*}**. If you don't have this bucket yet, sagemaker_session will create it for you."
"The **```fit```** method will create a training job named **```tensorboard-example-{unique identifier}```** on two **ml.c4.xlarge** instances. These instances will write checkpoints to the S3 bucket **```sagemaker-{your aws account number}```**.\n",
"\n",
"If you don't have this bucket yet, **```sagemaker_session```** will create it for you. These checkpoints can be used to restore the training job and to analyze training job metrics using **TensorBoard**.\n",
"\n",
"The parameter **```run_tensorboard_locally=True```** will run **TensorBoard** on the machine where this notebook is running. Every time the training job creates a new checkpoint in the S3 bucket, **```fit```** will download the checkpoint to the temp folder that **TensorBoard** is pointing to.\n",
"\n",
"When the **```fit```** method starts the training, it will log the port that **TensorBoard** is using to display the metrics. The default port is **6006**, but another port can be chosen if 6006 is not available. The port number is incremented until an available port is found, and the chosen port is then printed to stdout.\n",
"\n",
"It takes a few minutes to provision containers and start the training job. **TensorBoard** will start to display metrics shortly after that.\n",
"\n",
"You can access **TensorBoard** locally at [http://localhost:6006](http://localhost:6006) or using your SageMaker workspace [proxy/6006](/proxy/6006). If TensorBoard started on a different port, adjust these URLs to match."
]
},
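The port selection described in the cell above (start at 6006, increment until a free port is found) can be illustrated with a minimal sketch. This is not the SageMaker SDK's own implementation; `find_available_port` and `tensorboard_url` are hypothetical helper names introduced here only for illustration, and the sketch assumes that "available" means the port can be bound on the local interface.

```python
import socket


def find_available_port(start=6006, max_tries=100, host="127.0.0.1"):
    """Return the first port >= start that can be bound on host.

    Hypothetical helper: mirrors the behaviour described in the notebook
    text, not the SDK's actual code.
    """
    # Never probe past the highest valid TCP port number.
    for port in range(start, min(start + max_tries, 65536)):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # Port is in use; try the next one.
            return port
    raise RuntimeError("no available port found starting at %d" % start)


def tensorboard_url(port):
    """Build the local URL for a TensorBoard instance on the given port."""
    return "http://localhost:%d" % port


if __name__ == "__main__":
    port = find_available_port()
    print("TensorBoard would be reachable at", tensorboard_url(port))
```

If the port printed by `fit` differs from 6006, the same substitution applies to the workspace proxy path (for example `/proxy/6007`).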
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# Deploy the trained model to prepare for predictions\n",
"\n",
"The deploy() method creates an endpoint which serves prediction requests in real-time."
"To avoid incurring charges to your AWS account for the resources used in this tutorial, you need to delete the **SageMaker Endpoint**:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"estimator.delete_endpoint()"
]
}
],
"metadata": {
"notice": "Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.",