|
4 | 4 | "cell_type": "markdown",
|
5 | 5 | "metadata": {},
|
6 | 6 | "source": [
|
| 7 | +<<<<<<< HEAD |
7 | 8 | <<<<<<< HEAD
|
8 | 9 | "# Use Script Mode to train any TensorFlow script from GitHub in SageMaker\n",
|
9 | 10 | "\n",
|
|
12 | 13 | "For this example, you use [Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow](https://github.com/sherjilozair/char-rnn-tensorflow), but you can use the same technique for other scripts or repositories. For example, [TensorFlow Model Zoo](https://github.com/tensorflow/models) and [TensorFlow benchmark scripts](https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks)."
|
13 | 14 | =======
|
14 | 15 | "# Using the Script Mode to train any TensorFlow script from GitHub in SageMaker\n",
|
| 16 | +======= |
| 17 | + "# Use Script Mode to train any TensorFlow script from GitHub in SageMaker\n", |
| 18 | +>>>>>>> Edited the tf script mode notebook (#90) |
15 | 19 | "\n",
|
16 |
| - "In this tutorial, we show how simple it is to train a TensorFlow script in SageMaker using the new Script Mode Tensorflow Container.\n", |
| 20 | + "In this tutorial, you train a TensorFlow script in SageMaker using the new Script Mode Tensorflow Container.\n", |
17 | 21 | "\n",
|
| 22 | +<<<<<<< HEAD |
18 | 23 | "The example we chose is [Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow](https://github.com/sherjilozair/char-rnn-tensorflow) but this same technique can be used to other scripts or repositories including [TensorFlow Model Zoo](https://github.com/tensorflow/models) and [TensorFlow benchmark scripts](https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks)."
|
19 | 24 | >>>>>>> Add Script Mode example (#83)
|
| 25 | +======= |
| 26 | + "For this example, you use [Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow](https://github.com/sherjilozair/char-rnn-tensorflow), but you can use the same technique for other scripts or repositories. For example, [TensorFlow Model Zoo](https://github.com/tensorflow/models) and [TensorFlow benchmark scripts](https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks)." |
| 27 | +>>>>>>> Edited the tf script mode notebook (#90) |
20 | 28 | ]
|
21 | 29 | },
|
22 | 30 | {
|
23 | 31 | "cell_type": "markdown",
|
24 | 32 | "metadata": {},
|
25 | 33 | "source": [
|
26 | 34 | <<<<<<< HEAD
|
| 35 | +<<<<<<< HEAD |
| 36 | +======= |
| 37 | +>>>>>>> Edited the tf script mode notebook (#90) |
27 | 38 | "## Set up the environment\n",
|
28 | 39 | "Let's start by creating a SageMaker session and specifying the following:\n",
|
29 | 40 | "- The S3 bucket and prefix to use for training and model data. The bucket should be in the same region as the Notebook Instance, training instance(s), and hosting instance(s). This example uses the default bucket that a SageMaker `Session` creates.\n",
|
30 | 41 | "- The IAM role that allows SageMaker services to access your data. For more information about using IAM roles in SageMaker, see [Amazon SageMaker Roles](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html).\n"
|
| 42 | +<<<<<<< HEAD |
31 | 43 | =======
|
32 | 44 | "## Setting up the environment\n",
|
33 | 45 | "Let's start by creating a SageMaker session and specifying:\n",
|
34 | 46 | "- The S3 bucket and prefix that you want to use for training and model data. It should be within the same region as the Notebook Instance, training, and hosting.\n",
|
35 | 47 | "- The IAM role allows SageMaker services to access your data. See the documentation [for how to create these](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html).\n"
|
36 | 48 | >>>>>>> Add Script Mode example (#83)
|
| 49 | +======= |
| 50 | +>>>>>>> Edited the tf script mode notebook (#90) |
37 | 51 | ]
|
38 | 52 | },
|
39 | 53 | {
|
|
55 | 69 | "cell_type": "markdown",
|
56 | 70 | "metadata": {},
|
57 | 71 | "source": [
|
| 72 | +<<<<<<< HEAD |
58 | 73 | <<<<<<< HEAD
|
59 | 74 | "### Clone the repository\n",
|
60 | 75 | "Run the following command to clone the repository that contains the example:"
|
61 | 76 | =======
|
62 | 77 | "### Clone the repository"
|
63 | 78 | >>>>>>> Add Script Mode example (#83)
|
| 79 | +======= |
| 80 | + "### Clone the repository\n", |
| 81 | + "Run the following command to clone the repository that contains the example:" |
| 82 | +>>>>>>> Edited the tf script mode notebook (#90) |
64 | 83 | ]
|
65 | 84 | },
|
66 | 85 | {
|
|
93 | 112 | "cell_type": "markdown",
|
94 | 113 | "metadata": {},
|
95 | 114 | "source": [
|
| 115 | +<<<<<<< HEAD |
96 | 116 | <<<<<<< HEAD
|
97 | 117 | "### Get the data\n",
|
98 | 118 | "For training data, use plain text versions of Sherlock Holmes stories."
|
99 | 119 | =======
|
100 | 120 | "### Getting the data"
|
101 | 121 | >>>>>>> Add Script Mode example (#83)
|
| 122 | +======= |
| 123 | + "### Get the data\n", |
| 124 | + "For training data, use plain text versions of Sherlock Holmes stories." |
| 125 | +>>>>>>> Edited the tf script mode notebook (#90) |
102 | 126 | ]
|
103 | 127 | },
|
104 | 128 | {
|
|
115 | 139 | "cell_type": "markdown",
|
116 | 140 | "metadata": {},
|
117 | 141 | "source": [
|
| 142 | +<<<<<<< HEAD |
118 | 143 | <<<<<<< HEAD
|
119 | 144 | "## Test locally"
|
120 | 145 | =======
|
121 | 146 | "## Testing locally"
|
122 | 147 | >>>>>>> Add Script Mode example (#83)
|
| 148 | +======= |
| 149 | + "## Test locally" |
| 150 | +>>>>>>> Edited the tf script mode notebook (#90) |
123 | 151 | ]
|
124 | 152 | },
|
125 | 153 | {
|
|
167 | 195 | "metadata": {},
|
168 | 196 | "source": [
|
169 | 197 | "\n",
|
| 198 | +<<<<<<< HEAD |
170 | 199 | <<<<<<< HEAD
|
171 | 200 | "Use [Local Mode](https://github.com/aws/sagemaker-python-sdk#local-mode) to run the script locally in the notebook instance before you run a SageMaker training job:"
|
172 | 201 | =======
|
173 | 202 | "We can use [Local Mode](https://github.com/aws/sagemaker-python-sdk#local-mode) to simulate SageMaker locally before submit training:"
|
174 | 203 | >>>>>>> Add Script Mode example (#83)
|
| 204 | +======= |
| 205 | + "Use [Local Mode](https://github.com/aws/sagemaker-python-sdk#local-mode) to run the script locally in the notebook instance before you run a SageMaker training job:" |
| 206 | +>>>>>>> Edited the tf script mode notebook (#90) |
175 | 207 | ]
|
176 | 208 | },
|
177 | 209 | {
|
|
188 | 220 | "\n",
|
189 | 221 | "estimator = ScriptModeTensorFlow(entry_point='train.py',\n",
|
190 | 222 | " source_dir='char-rnn-tensorflow',\n",
|
| 223 | +<<<<<<< HEAD |
191 | 224 | <<<<<<< HEAD
|
192 | 225 | " train_instance_type='local', # Run in local mode\n",
|
193 | 226 | =======
|
194 | 227 | " train_instance_type='local', \n",
|
195 | 228 | >>>>>>> Add Script Mode example (#83)
|
| 229 | +======= |
| 230 | + " train_instance_type='local', # Run in local mode\n", |
| 231 | +>>>>>>> Edited the tf script mode notebook (#90) |
196 | 232 | " train_instance_count=1,\n",
|
197 | 233 | " hyperparameters=hyperparameters,\n",
|
198 | 234 | " role=role)\n",
|
|
206 | 242 | "source": [
|
207 | 243 | "## How Script Mode executes the script in the container\n",
|
208 | 244 | "\n",
|
| 245 | +<<<<<<< HEAD |
209 | 246 | <<<<<<< HEAD
|
210 | 247 | "The above cell downloads a Python 3 CPU container locally and simulates a SageMaker training job. When training starts, script mode installs the user script as a Python module. The module name matches the script name. In this case, **train.py** is transformed into a Python module named **train**.\n",
|
211 | 248 | "\n",
|
|
215 | 252 | "\n",
|
216 | 253 | "After that, the Python interpreter executes the user module, passing **hyperparameters** as script arguments. The example above will be executed as follow:\n",
|
217 | 254 | >>>>>>> Add Script Mode example (#83)
|
| 255 | +======= |
| 256 | + "The above cell downloads a Python 3 CPU container locally and simulates a SageMaker training job. When training starts, script mode installs the user script as a Python module. The module name matches the script name. In this case, **train.py** is transformed into a Python module named **train**.\n", |
| 257 | + "\n", |
| 258 | + "After that, the Python interpreter executes the user module, passing **hyperparameters** as script arguments. The example above is executed as follows:\n", |
| 259 | +>>>>>>> Edited the tf script mode notebook (#90) |
218 | 260 | "```bash\n",
|
219 | 261 | "python -m train --num-epochs 1 --data-dir /opt/ml/input/data/training --save-dir /opt/ml/model\n",
|
220 | 262 | "```\n",
|
221 | 263 | "\n",
|
| 264 | +<<<<<<< HEAD |
222 | 265 | <<<<<<< HEAD
|
223 | 266 | "The **train** module consumes the hyperparameters using any argument parsing library. [The example we're using](https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/train.py#L11) uses the Python [argparse](https://docs.python.org/3/library/argparse.html) library:\n",
|
224 | 267 | =======
|
225 | 268 | "A user provide script consumes the hyperparameters using any argument parsing library, [in the example above](https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/train.py#L11):\n",
|
226 | 269 | >>>>>>> Add Script Mode example (#83)
|
| 270 | +======= |
| 271 | + "The **train** module consumes the hyperparameters using any argument parsing library. [The example we're using](https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/train.py#L11) uses the Python [argparse](https://docs.python.org/3/library/argparse.html) library:\n", |
| 272 | +>>>>>>> Edited the tf script mode notebook (#90) |
227 | 273 | "\n",
|
228 | 274 | "```python\n",
|
229 | 275 | "parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)\n",
|
|
238 | 284 | "\n",
|
239 | 285 | "Let's explain the values of **--data_dir** and **--save-dir**:\n",
|
240 | 286 | "\n",
|
| 287 | +<<<<<<< HEAD |
241 | 288 | <<<<<<< HEAD
|
242 | 289 | "- **/opt/ml/input/data/training** is the directory inside the container where the training data is downloaded. The data is downloaded to this folder because **training** is the channel name defined in ```estimator.fit({'training': inputs})```. See [training data](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo.html#your-algorithms-training-algo-running-container-trainingdata) for more information. \n",
|
243 | 290 | "\n",
|
|
251 | 298 | "For example, the example above can read information about the **training** channel provided in the training job request by adding the environment variable `SM_CHANNEL_TRAINING` as the default value for the `--data_dir` argument:\n",
|
252 | 299 | =======
|
253 | 300 | "- **/opt/ml/input/data/training** is the directory inside the container where the training data is downloaded. The data was downloaded in this folder because **training** is the channel name defined in ```estimator.fit({'training': inputs})```. See [training data](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo.html#your-algorithms-training-algo-running-container-trainingdata) for more information. \n",
|
| 301 | +======= |
| 302 | + "- **/opt/ml/input/data/training** is the directory inside the container where the training data is downloaded. The data is downloaded to this folder because **training** is the channel name defined in ```estimator.fit({'training': inputs})```. See [training data](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo.html#your-algorithms-training-algo-running-container-trainingdata) for more information. \n", |
| 303 | +>>>>>>> Edited the tf script mode notebook (#90) |
254 | 304 | "\n",
|
255 |
| - "- **/opt/ml/model** use this directory to save models, checkpoints or any other data. Any data saved in this folder is saved in the S3 bucket defined for training. See [model data](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo.html#your-algorithms-training-algo-envvariables) for more information.\n", |
| 305 | + "- **/opt/ml/model** use this directory to save models, checkpoints, or any other data. Any data saved in this folder is saved in the S3 bucket defined for training. See [model data](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo.html#your-algorithms-training-algo-envvariables) for more information.\n", |
256 | 306 | "\n",
|
257 | 307 | "### Reading additional information from the container\n",
|
258 | 308 | "\n",
|
259 |
| - "Very often, a user script needs additional information from the container that is not available in ```hyperparameters```.\n", |
260 |
| - "SageMaker Containers writes this information as **environment variables** that are available inside the script.\n", |
| 309 | + "Often, a user script needs additional information from the container that is not available in ```hyperparameters```.\n", |
| 310 | + "SageMaker containers write this information as **environment variables** that are available inside the script.\n", |
261 | 311 | "\n",
|
| 312 | +<<<<<<< HEAD |
262 | 313 | "For example, the example above can read information about the **training** channel provided in the training job request:\n",
|
263 | 314 | >>>>>>> Add Script Mode example (#83)
|
| 315 | +======= |
| 316 | + "For example, the example above can read information about the **training** channel provided in the training job request by adding the environment variable `SM_CHANNEL_TRAINING` as the default value for the `--data_dir` argument:\n", |
| 317 | +>>>>>>> Edited the tf script mode notebook (#90) |
264 | 318 | "\n",
|
265 | 319 | "```python\n",
|
266 | 320 | "if __name__ == '__main__':\n",
|
|
269 | 323 | " parser.add_argument('--data_dir', type=str, default=os.environ['SM_CHANNEL_TRAINING'])\n",
|
270 | 324 | "```\n",
|
271 | 325 | "\n",
|
| 326 | +<<<<<<< HEAD |
272 | 327 | <<<<<<< HEAD
|
273 | 328 | "Script mode displays the list of available environment variables in the training logs. You can find the [entire list here](https://github.com/aws/sagemaker-containers/blob/master/README.md#environment-variables-full-specification)."
|
274 | 329 | =======
|
275 | 330 | "Script Mode displays the list of the environment variables available in the training logs. You can find the [entire list here](https://github.com/aws/sagemaker-containers/blob/master/README.md#environment-variables-full-specification)."
|
276 | 331 | >>>>>>> Add Script Mode example (#83)
|
| 332 | +======= |
| 333 | + "Script mode displays the list of available environment variables in the training logs. You can find the [entire list here](https://github.com/aws/sagemaker-containers/blob/master/README.md#environment-variables-full-specification)." |
| 334 | +>>>>>>> Edited the tf script mode notebook (#90) |
277 | 335 | ]
|
278 | 336 | },
|
279 | 337 | {
|
|
287 | 345 | "cell_type": "markdown",
|
288 | 346 | "metadata": {},
|
289 | 347 | "source": [
|
| 348 | +<<<<<<< HEAD |
290 | 349 | <<<<<<< HEAD
|
291 | 350 | "After you test the training job locally, upload the dataset to an S3 bucket so SageMaker can access the data during training.\n"
|
292 | 351 | =======
|
293 | 352 | "We need to upload the dataset to an S3 bucket so SageMaker can access the data during training.\n"
|
294 | 353 | >>>>>>> Add Script Mode example (#83)
|
| 354 | +======= |
| 355 | + "After you test the training job locally, upload the dataset to an S3 bucket so SageMaker can access the data during training.\n" |
| 356 | +>>>>>>> Edited the tf script mode notebook (#90) |
295 | 357 | ]
|
296 | 358 | },
|
297 | 359 | {
|
|
307 | 369 | "cell_type": "markdown",
|
308 | 370 | "metadata": {},
|
309 | 371 | "source": [
|
| 372 | +<<<<<<< HEAD |
310 | 373 | <<<<<<< HEAD
|
311 | 374 | "To train in SageMaker, change the estimator argument **train_instance_type** to any SageMaker ml instance available for training. For example:"
|
312 | 375 | =======
|
313 | 376 | "You can change the estimator argument **train_instance_type** to any SageMaker ml instance available for training. For example:"
|
314 | 377 | >>>>>>> Add Script Mode example (#83)
|
| 378 | +======= |
| 379 | + "To train in SageMaker, change the estimator argument **train_instance_type** to any SageMaker ml instance available for training. For example:" |
| 380 | +>>>>>>> Edited the tf script mode notebook (#90) |
315 | 381 | ]
|
316 | 382 | },
|
317 | 383 | {
|
|
348 | 414 | "cell_type": "markdown",
|
349 | 415 | "metadata": {},
|
350 | 416 | "source": [
|
| 417 | +<<<<<<< HEAD |
351 | 418 | <<<<<<< HEAD
|
352 | 419 | "Script Mode installs the contents of your `source_dir` folder in the container as a [Python package](https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_modules.py#L100). You can include a [requirements.txt file in the root folder of your source_dir to install any pip dependencies](https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_modules.py#L111). You can, for example, install the lastest version of TensorFlow in the container:\n",
|
353 | 420 | =======
|
354 | 421 | "Script Mode will install your source_dir in the container as a [Python package](https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_modules.py#L100). You can include a [requirements.txt file in the root folder of your source_dir to install any pip dependencies](https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_modules.py#L111). You can, for example, install the lastest version of tensorflow in the container:\n",
|
355 | 422 | >>>>>>> Add Script Mode example (#83)
|
| 423 | +======= |
| 424 | + "Script Mode installs the contents of your `source_dir` folder in the container as a [Python package](https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_modules.py#L100). You can include a [requirements.txt file in the root folder of your source_dir to install any pip dependencies](https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_modules.py#L111). You can, for example, install the lastest version of TensorFlow in the container:\n", |
| 425 | +>>>>>>> Edited the tf script mode notebook (#90) |
356 | 426 | "\n",
|
357 | 427 | "content of requirements.txt\n",
|
358 | 428 | "```\n",
|
|
365 | 435 | "metadata": {},
|
366 | 436 | "source": [
|
367 | 437 | "# Installing apt-get packages and other dependencies\n",
|
| 438 | +<<<<<<< HEAD |
368 | 439 | <<<<<<< HEAD
|
369 | 440 | "You can define a `setup.py` file in your `source_dir` folder to install other dependencies. The example below installs [TensorFlow for C](https://www.tensorflow.org/install/lang_c) in the container."
|
370 | 441 | =======
|
371 | 442 | "You can define a setup.py file in your source_dir to install other dependencies. The example below will install [TensorFlow for C](https://www.tensorflow.org/install/lang_c) in the container."
|
372 | 443 | >>>>>>> Add Script Mode example (#83)
|
| 444 | +======= |
| 445 | + "You can define a `setup.py` file in your `source_dir` folder to install other dependencies. The example below installs [TensorFlow for C](https://www.tensorflow.org/install/lang_c) in the container." |
| 446 | +>>>>>>> Edited the tf script mode notebook (#90) |
373 | 447 | ]
|
374 | 448 | },
|
375 | 449 | {
|
|
473 | 547 | ],
|
474 | 548 | "metadata": {
|
475 | 549 | "kernelspec": {
|
| 550 | +<<<<<<< HEAD |
476 | 551 | <<<<<<< HEAD
|
477 | 552 | "display_name": "Python 3",
|
478 | 553 | "language": "python",
|
|
482 | 557 | "language": "python",
|
483 | 558 | "name": "python2"
|
484 | 559 | >>>>>>> Add Script Mode example (#83)
|
| 560 | +======= |
| 561 | + "display_name": "Python 3", |
| 562 | + "language": "python", |
| 563 | + "name": "python3" |
| 564 | +>>>>>>> Edited the tf script mode notebook (#90) |
485 | 565 | },
|
486 | 566 | "language_info": {
|
487 | 567 | "codemirror_mode": {
|
488 | 568 | "name": "ipython",
|
| 569 | +<<<<<<< HEAD |
489 | 570 | <<<<<<< HEAD
|
490 | 571 | "version": 3
|
491 | 572 | =======
|
492 | 573 | "version": 2
|
493 | 574 | >>>>>>> Add Script Mode example (#83)
|
| 575 | +======= |
| 576 | + "version": 3 |
| 577 | +>>>>>>> Edited the tf script mode notebook (#90) |
494 | 578 | },
|
495 | 579 | "file_extension": ".py",
|
496 | 580 | "mimetype": "text/x-python",
|
497 | 581 | "name": "python",
|
498 | 582 | "nbconvert_exporter": "python",
|
| 583 | +<<<<<<< HEAD |
499 | 584 | <<<<<<< HEAD
|
500 | 585 | "pygments_lexer": "ipython3",
|
501 | 586 | "version": "3.6.5"
|
502 | 587 | =======
|
503 | 588 | "pygments_lexer": "ipython2",
|
504 | 589 | "version": "2.7.15"
|
505 | 590 | >>>>>>> Add Script Mode example (#83)
|
| 591 | +======= |
| 592 | + "pygments_lexer": "ipython3", |
| 593 | + "version": "3.6.5" |
| 594 | +>>>>>>> Edited the tf script mode notebook (#90) |
506 | 595 | }
|
507 | 596 | },
|
508 | 597 | "nbformat": 4,
|
|
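The notebook text in this diff says that Script Mode passes hyperparameters to the `train` module as command-line arguments, and that `SM_CHANNEL_TRAINING` can be the default for `--data_dir`. A minimal sketch of that pattern follows. It is not the code from `train.py`: the `build_parser` helper and the fallback path used when the variable is unset are assumptions added so the snippet also runs outside a container, and the flag spellings are copied from the notebook text as shown in the diff.

```python
import argparse
import os


def build_parser():
    """Build an argument parser in the style the notebook describes (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    # Inside a SageMaker container, SM_CHANNEL_TRAINING points at the
    # directory for the 'training' channel. The fallback value is an
    # assumption so that this sketch runs on a machine without the variable.
    parser.add_argument(
        '--data_dir', type=str,
        default=os.environ.get('SM_CHANNEL_TRAINING',
                               '/opt/ml/input/data/training'))
    # Hyperparameters arrive as ordinary script arguments.
    parser.add_argument('--num-epochs', type=int, default=1)
    return parser


# Parse an explicit argument list instead of sys.argv to keep the example
# independent of how it is launched.
args = build_parser().parse_args(['--num-epochs', '1'])
print(args.data_dir, args.num_epochs)
```

Using `os.environ.get` with a fallback, rather than `os.environ[...]` as in the notebook snippet, avoids a `KeyError` when the script is run outside a training container. Whether that trade-off is wanted depends on how strictly the script should require the channel.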