|
6 | 6 | "source": [
|
7 | 7 | "# Creating Estimators in tf.estimator with Keras\n",
|
8 | 8 | "\n",
|
9 |
| - "If you are getting started using SageMaker, check [iris-dnn-classifier-using-estimators/tutorial.ipynb](iris-dnn-classifier-using-estimators/tutorial.ipynb) first.\n", |
10 |
| - "\n", |
11 |
| - "This tutorial covers how to create your own `Estimator` using the building\n", |
12 |
| - "blocks provided in `tf.estimator`, which will predict the ages of\n", |
| 9 | + "This tutorial covers how to create your own training script using the building\n", |
| 10 | + "blocks provided in `tf.keras`, which will predict the ages of\n", |
13 | 11 | "[abalones](https://en.wikipedia.org/wiki/Abalone) based on their physical\n",
|
14 | 12 | "measurements. You'll learn how to do the following:\n",
|
15 | 13 | "\n",
|
|
54 | 52 | "cell_type": "markdown",
|
55 | 53 | "metadata": {},
|
56 | 54 | "source": [
|
57 |
| - "## Let's start by setting up the environment." |
| 55 | + "### Set up the environment" |
58 | 56 | ]
|
59 | 57 | },
|
60 | 58 | {
|
|
67 | 65 | "source": [
|
68 | 66 | "import os\n",
|
69 | 67 | "import sagemaker\n",
|
| 68 | + "from sagemaker import get_execution_role\n", |
70 | 69 | "\n",
|
71 | 70 | "sagemaker_session = sagemaker.Session()\n",
|
72 | 71 | "\n",
|
73 |
| - "# Replace with a role (either name or full arn) that gives SageMaker access to S3 and cloudwatch\n", |
74 |
| - "role='SageMakerRole'" |
| 72 | + "role = get_execution_role()" |
75 | 73 | ]
|
76 | 74 | },
|
77 | 75 | {
|
78 | 76 | "cell_type": "markdown",
|
79 | 77 | "metadata": {},
|
80 | 78 | "source": [
|
81 |
| - "## Uploading the data" |
| 79 | + "### Upload the data to an S3 bucket" |
82 | 80 | ]
|
83 | 81 | },
|
84 | 82 | {
|
|
96 | 94 | "cell_type": "markdown",
|
97 | 95 | "metadata": {},
|
98 | 96 | "source": [
|
99 |
| - "# Complete source code" |
| 97 | + "**sagemaker_session.upload_data** will upload the abalone dataset from your machine to a bucket named **sagemaker-{your aws account number}**. If you don't have this bucket yet, sagemaker_session will create it for you." |
| 98 | + ] |
| 99 | + }, |
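As a rough illustration of the default bucket naming convention described above (the account number below is a hypothetical placeholder, not a real account):

```python
# Sketch of the default bucket name SageMaker derives from your account.
# The account number here is a made-up placeholder for illustration.
account_id = "123456789012"
default_bucket = "sagemaker-{}".format(account_id)
print(default_bucket)
```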
| 100 | + { |
| 101 | + "cell_type": "markdown", |
| 102 | + "metadata": {}, |
| 103 | + "source": [ |
| 104 | + "## Complete source code\n", |
| 105 | + "Here is the full code for the network model:" |
100 | 106 | ]
|
101 | 107 | },
|
102 | 108 | {
|
|
337 | 343 | " # Calculate loss using mean squared error\n",
|
338 | 344 | " loss = tf.losses.mean_squared_error(labels, predictions)\n",
|
339 | 345 | " ...\n",
|
340 |
| - "```\n", |
341 |
| - "\n", |
342 |
| - "See the [tf.losses$API guide](https://www.tensorflow.org/api_docs/python/tf/losses) for a\n", |
343 |
| - "full list of loss functions and more details on supported arguments and usage.\n", |
344 |
| - "\n", |
| 346 | + "```" |
| 347 | + ] |
| 348 | + }, |
| 349 | + { |
| 350 | + "cell_type": "markdown", |
| 351 | + "metadata": {}, |
| 352 | + "source": [ |
345 | 353 | "Supplementary metrics for evaluation can be added to an `eval_metric_ops` dict.\n",
|
346 | 354 | "The following code defines an `rmse` metric, which calculates the root mean\n",
|
347 | 355 | "squared error for the model predictions. Note that the `labels` tensor is cast\n",
|
|
375 | 383 | " loss=loss, global_step=tf.train.get_global_step())\n",
|
376 | 384 | "```\n",
|
377 | 385 | "\n",
|
378 |
| - "For a full list of optimizers, and other details, see the\n", |
379 |
| - "@{$python/train#optimizers$API guide}.\n", |
380 |
| - "\n", |
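The loss and supplementary metric discussed in this section can be sketched outside of TensorFlow with plain NumPy; the label and prediction values below are made up for illustration:

```python
import numpy as np

# Hypothetical labels and model predictions, for illustration only.
labels = np.array([7.0, 9.0, 10.0, 8.0])
predictions = np.array([6.5, 9.5, 9.0, 8.0])

# Mean squared error, the quantity tf.losses.mean_squared_error computes.
mse = np.mean((labels - predictions) ** 2)

# Root mean squared error, the supplementary evaluation metric.
rmse = np.sqrt(mse)

print(mse, rmse)
```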
381 | 386 | "### The complete abalone `model_fn`\n",
|
382 | 387 | "\n",
|
383 | 388 | "Here's the final, complete `model_fn` for the abalone age predictor. The\n",
|
|
435 | 440 | "cell_type": "markdown",
|
436 | 441 | "metadata": {},
|
437 | 442 | "source": [
|
438 |
| - "# Submitting script for training in SageMaker\n", |
439 |
| - "\n", |
| 443 | + "# Submitting the script for training\n", |
440 | 444 | "\n",
|
441 | 445 | "We can use the SDK to run our local training script on SageMaker infrastructure.\n",
|
442 | 446 | "\n",
|
|
456 | 460 | "from sagemaker.tensorflow import TensorFlow\n",
|
457 | 461 | "\n",
|
458 | 462 | "abalone_estimator = TensorFlow(entry_point='abalone.py',\n",
|
459 |
| - " role=role,\n", |
460 |
| - " hyperparameters={'training_steps': 100, 'learning_rate': 0.001},\n", |
461 |
| - " train_instance_count=1,\n", |
462 |
| - " train_instance_type='ml.c4.xlarge')\n", |
| 463 | + " role=role,\n", |
| 464 | + "                               training_steps=100,\n", |
| 465 | + "                               evaluation_steps=100,\n", |
| 466 | + " hyperparameters={'learning_rate': 0.001},\n", |
| 467 | + " train_instance_count=1,\n", |
| 468 | + " train_instance_type='ml.c4.xlarge')\n", |
463 | 469 | "\n",
|
464 | 470 | "abalone_estimator.fit(inputs)"
|
465 | 471 | ]
|
|
470 | 476 | "source": [
|
471 | 477 | "`estimator.fit` will deploy a script in a container for training and returns the SageMaker model name using the following arguments:\n",
|
472 | 478 | "\n",
|
473 |
| - "* `framework=tensorflow`. Tells submit_training that it is a tensorflow container.\n", |
474 |
| - "* `script=\"abalone.py\"`. The relative path to the script that will be deployed to the container.\n", |
475 |
| - "* `data=\"s3://bucket_name/abalone-estimator-training\"`. The S3 location of the bucket that we uploaded earlier.\n", |
476 |
| - "* `role=\"maeve-pullrole\"`. AWS role that gives your account access to SageMaker training and hosting\n", |
477 |
| - "* `hyperparameters={'training_steps' : 100}`. Model and training hyperparameters. \n", |
| 479 | + "* **`entry_point=\"abalone.py\"`** The path to the script that will be deployed to the container.\n", |
| 480 | + "* **`training_steps=100`** The number of training steps of the training job.\n", |
| 481 | + "* **`evaluation_steps=100`** The number of evaluation steps of the training job.\n", |
| 482 | + "* **`role`** The AWS IAM role that gives your account access to SageMaker training and hosting.\n", |
| 483 | + "* **`hyperparameters={'learning_rate': 0.001}`** Training hyperparameters.\n", |
478 | 484 | "\n",
|
479 | 485 | "Running the code block above will do the following actions:\n",
|
480 | 486 | "* deploy your script in a container with tensorflow installed\n",
|
|
501 | 507 | },
|
502 | 508 | "outputs": [],
|
503 | 509 | "source": [
|
504 |
| - "abalone_predictor = abalone_estimator.deploy(initial_instance_count=1,\n", |
505 |
| - " instance_type='ml.c4.xlarge')" |
| 510 | + "abalone_predictor = abalone_estimator.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge')" |
506 | 511 | ]
|
507 | 512 | },
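Once the endpoint is up, predictions are made against vectors of abalone measurements. As a rough sketch (the seven numeric features and their values below are assumptions based on the abalone dataset, and the live call is only shown in a comment because it requires a deployed endpoint):

```python
# A hypothetical abalone sample: seven numeric measurements
# (length, diameter, height, whole weight, shucked weight,
# viscera weight, shell weight). Values are made up for illustration.
sample = [0.455, 0.365, 0.095, 0.514, 0.2245, 0.101, 0.15]

# Against a live endpoint, the call would look roughly like:
#   result = abalone_predictor.predict(sample)
print(len(sample), sample)
```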
|
508 | 513 | {
|
|