Commit 9b76f52

RLEstimator documentaion (#514)
1 parent db1876d commit 9b76f52

2 files changed

+355
-11
lines changed

README.rst

Lines changed: 29 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -32,17 +32,18 @@ Table of Contents
3232
4. `TensorFlow SageMaker Estimators <#tensorflow-sagemaker-estimators>`__
3333
5. `Chainer SageMaker Estimators <#chainer-sagemaker-estimators>`__
3434
6. `PyTorch SageMaker Estimators <#pytorch-sagemaker-estimators>`__
35-
7. `SageMaker SparkML Serving <#sagemaker-sparkml-serving>`__
36-
8. `AWS SageMaker Estimators <#aws-sagemaker-estimators>`__
37-
9. `Using SageMaker AlgorithmEstimators <#using-sagemaker-algorithmestimators>`__
38-
10. `Consuming SageMaker Model Packages <#consuming-sagemaker-model-packages>`__
39-
11. `BYO Docker Containers with SageMaker Estimators <#byo-docker-containers-with-sagemaker-estimators>`__
40-
12. `SageMaker Automatic Model Tuning <#sagemaker-automatic-model-tuning>`__
41-
13. `SageMaker Batch Transform <#sagemaker-batch-transform>`__
42-
14. `Secure Training and Inference with VPC <#secure-training-and-inference-with-vpc>`__
43-
15. `BYO Model <#byo-model>`__
44-
16. `Inference Pipelines <#inference-pipelines>`__
45-
17. `SageMaker Workflow <#sagemaker-workflow>`__
35+
7. `SageMaker Reinforcement Learning Estimators <#sagemaker-reinforcement-learning-estimators>`__
36+
8. `SageMaker SparkML Serving <#sagemaker-sparkml-serving>`__
37+
9. `AWS SageMaker Estimators <#aws-sagemaker-estimators>`__
38+
10. `Using SageMaker AlgorithmEstimators <#using-sagemaker-algorithmestimators>`__
39+
11. `Consuming SageMaker Model Packages <#consuming-sagemaker-model-packages>`__
40+
12. `BYO Docker Containers with SageMaker Estimators <#byo-docker-containers-with-sagemaker-estimators>`__
41+
13. `SageMaker Automatic Model Tuning <#sagemaker-automatic-model-tuning>`__
42+
14. `SageMaker Batch Transform <#sagemaker-batch-transform>`__
43+
15. `Secure Training and Inference with VPC <#secure-training-and-inference-with-vpc>`__
44+
16. `BYO Model <#byo-model>`__
45+
17. `Inference Pipelines <#inference-pipelines>`__
46+
18. `SageMaker Workflow <#sagemaker-workflow>`__
4647

4748

4849
Installing the SageMaker Python SDK
@@ -143,6 +144,7 @@ The following sections of this document explain how to use the different estimat
143144
* `TensorFlow SageMaker Estimators and Models <#tensorflow-sagemaker-estimators>`__
144145
* `Chainer SageMaker Estimators and Models <#chainer-sagemaker-estimators>`__
145146
* `PyTorch SageMaker Estimators <#pytorch-sagemaker-estimators>`__
147+
* `SageMaker Reinforcement Learning Estimators <#sagemaker-reinforcement-learning-estimators>`__
146148
* `AWS SageMaker Estimators and Models <#aws-sagemaker-estimators>`__
147149
* `Custom SageMaker Estimators and Models <#byo-docker-containers-with-sagemaker-estimators>`__
148150

@@ -412,6 +414,22 @@ For more information about PyTorch SageMaker ``Estimators``, see `PyTorch SageMa
412414
.. _PyTorch SageMaker Estimators and Models: src/sagemaker/pytorch/README.rst
413415
414416
417+
SageMaker Reinforcement Learning Estimators
418+
-------------------------------------------
419+
420+
With Reinforcement Learning (RL) Estimators, you can use reinforcement learning to train models on Amazon SageMaker.
421+
422+
Supported versions of Coach: ``0.10.1`` with TensorFlow, ``0.11.0`` with TensorFlow or MXNet.
423+
For more information about Coach, see https://github.com/NervanaSystems/coach
424+
425+
Supported versions of Ray: ``0.5.3`` with TensorFlow.
426+
For more information about Ray, see https://github.com/ray-project/ray
427+
428+
For more information about SageMaker RL ``Estimators``, see `SageMaker Reinforcement Learning Estimators`_.
429+
430+
.. _SageMaker Reinforcement Learning Estimators: src/sagemaker/rl/README.rst
431+
432+
415433
SageMaker SparkML Serving
416434
-------------------------
417435

src/sagemaker/rl/README.rst

Lines changed: 326 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,326 @@
1+
===========================================
2+
SageMaker Reinforcement Learning Estimators
3+
===========================================
4+
5+
With Reinforcement Learning (RL) Estimators, you can train reinforcement learning models on Amazon SageMaker.
6+
7+
Supported versions of Coach: ``0.10.1`` with TensorFlow, ``0.11.0`` with TensorFlow or MXNet.
8+
For more information about Coach, see https://github.com/NervanaSystems/coach
9+
10+
Supported versions of Ray: ``0.5.3`` with TensorFlow.
11+
For more information about Ray, see https://github.com/ray-project/ray
12+
13+
Table of Contents
14+
-----------------
15+
16+
1. `RL Training <#rl-training>`__
17+
2. `RL Estimators <#rl-estimators>`__
18+
3. `Distributed RL Training <#distributed-rl-training>`__
19+
4. `Saving models <#saving-models>`__
20+
5. `Deploying RL Models <#deploying-rl-models>`__
21+
6. `RL Training Examples <#rl-training-examples>`__
22+
7. `SageMaker RL Docker Containers <#sagemaker-rl-docker-containers>`__
23+
24+
25+
RL Training
26+
-----------
27+
28+
Training RL models using ``RLEstimator`` is a two-step process:
29+
30+
1. Prepare a training script to run on SageMaker
31+
2. Run this script on SageMaker via an ``RLEstimator``.
32+
33+
You should prepare your script in a source file separate from the notebook, terminal session, or source file you're
34+
using to submit the script to SageMaker via an ``RLEstimator``. This is discussed in further detail below.
35+
36+
Suppose that you already have a training script called ``coach-train.py``.
37+
You can then create an ``RLEstimator`` with keyword arguments to point to this script and define how SageMaker runs it:
38+
39+
.. code:: python
40+
41+
from sagemaker.rl import RLEstimator, RLToolkit, RLFramework
42+
43+
rl_estimator = RLEstimator(entry_point='coach-train.py',
44+
toolkit=RLToolkit.COACH,
45+
toolkit_version='0.11.0',
46+
framework=RLFramework.TENSORFLOW,
47+
role='SageMakerRole',
48+
train_instance_type='ml.p3.2xlarge',
49+
train_instance_count=1)
50+
51+
After that, you simply tell the estimator to start a training job:
52+
53+
.. code:: python
54+
55+
rl_estimator.fit()
56+
57+
In the following sections, we'll discuss how to prepare a training script for execution on SageMaker
58+
and how to run that script on SageMaker using ``RLEstimator``.
59+
60+
61+
Preparing the RL Training Script
62+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
63+
64+
Your RL training script must be a source file compatible with Python 3.5 when using the MXNet framework, or with Python 3.6 when using TensorFlow.
65+
66+
The training script is very similar to a training script you might run outside of SageMaker, but you
67+
can access useful properties about the training environment through various environment variables, such as
68+
69+
* ``SM_MODEL_DIR``: A string representing the path to the directory to write model artifacts to.
70+
These artifacts are uploaded to S3 for model hosting.
71+
* ``SM_NUM_GPUS``: An integer representing the number of GPUs available to the host.
72+
* ``SM_OUTPUT_DATA_DIR``: A string representing the filesystem path to write output artifacts to. Output artifacts may
73+
include checkpoints, graphs, and other files to save, not including model artifacts. These artifacts are compressed
74+
and uploaded to S3 to the same S3 prefix as the model artifacts.
75+
76+
For the exhaustive list of available environment variables, see the
77+
`SageMaker Containers documentation <https://github.com/aws/sagemaker-containers#list-of-provided-environment-variables-by-sagemaker-containers>`__.
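
As a minimal sketch of how a training script might read the variables listed above: the ``read_training_env`` helper name and the fallback values are assumptions made so the script also runs outside SageMaker, and are not part of the SDK.

```python
import os


def read_training_env(environ=os.environ):
    """Collect the SageMaker-provided settings described above.

    The fallback values are assumptions so the script also runs locally.
    """
    return {
        # Directory where model artifacts should be written.
        "model_dir": environ.get("SM_MODEL_DIR", "/opt/ml/model"),
        # Environment variables are strings, so convert the GPU count.
        "num_gpus": int(environ.get("SM_NUM_GPUS", "0")),
        # Directory for non-model outputs such as checkpoints or graphs.
        "output_data_dir": environ.get("SM_OUTPUT_DATA_DIR", "/opt/ml/output/data"),
    }


if __name__ == "__main__":
    settings = read_training_env()
    print(settings)
```

Passing the environment as an argument keeps the helper easy to exercise with a plain ``dict`` in local tests.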
78+
79+
80+
RL Estimators
81+
-------------
82+
83+
The ``RLEstimator`` constructor takes both required and optional arguments.
84+
85+
Required arguments
86+
~~~~~~~~~~~~~~~~~~
87+
88+
The following are required arguments to the ``RLEstimator`` constructor. When you create an instance of ``RLEstimator``, you must include
89+
these in the constructor, either positionally or as keyword arguments.
90+
91+
- ``entry_point`` Path (absolute or relative) to the Python file which
92+
should be executed as the entry point to training.
93+
- ``role`` An AWS IAM role (either name or full ARN). The Amazon
94+
SageMaker training jobs and APIs that create Amazon SageMaker
95+
endpoints use this role to access training data and model artifacts.
96+
After the endpoint is created, the inference code might use the IAM
97+
role if it needs to access AWS resources.
98+
- ``train_instance_count`` Number of Amazon EC2 instances to use for
99+
training.
100+
- ``train_instance_type`` Type of EC2 instance to use for training, for
101+
example, 'ml.m4.xlarge'.
102+
103+
You must also include either:
104+
105+
- ``toolkit`` RL toolkit (Ray RLlib or Coach) you want to use for executing your model training code.
106+
107+
- ``toolkit_version`` RL toolkit version you want to use for executing your model training code.
108+
109+
- ``framework`` Framework (MXNet or TensorFlow) you want to be used as
110+
a toolkit backend for reinforcement learning training.
111+
112+
or provide:
113+
114+
- ``image_name`` An alternative docker image to use for training and
115+
serving. If specified, the estimator will use this image for training and
116+
hosting, instead of selecting the appropriate SageMaker official image based on
117+
framework_version and py_version. Refer to: `SageMaker RL Docker Containers
118+
<#sagemaker-rl-docker-containers>`_ for details on what the official images support
119+
and where to find the source code to build your custom image.
120+
121+
122+
Optional arguments
123+
~~~~~~~~~~~~~~~~~~
124+
125+
The following are optional arguments. When you create an ``RLEstimator`` object, you can specify these as keyword arguments.
126+
127+
- ``source_dir`` Path (absolute or relative) to a directory with any
128+
other training source code dependencies including the entry point
129+
file. Structure within this directory will be preserved when training
130+
on SageMaker.
131+
- ``dependencies (list[str])`` A list of paths to directories (absolute or relative) with
132+
any additional libraries that will be exported to the container (default: ``[]``).
133+
The library folders will be copied to SageMaker in the same folder where the entrypoint is copied.
134+
If the ``source_dir`` points to S3, code will be uploaded and the S3 location will be used
135+
instead. Example:
136+
137+
The following call
138+
>>> RLEstimator(entry_point='train.py',
139+
toolkit=RLToolkit.COACH,
140+
toolkit_version='0.11.0',
141+
framework=RLFramework.TENSORFLOW,
142+
dependencies=['my/libs/common', 'virtual-env'])
143+
results in the following inside the container:
144+
145+
>>> $ ls
146+
147+
>>> opt/ml/code
148+
>>> ├── train.py
149+
>>> ├── common
150+
>>> └── virtual-env
151+
152+
- ``hyperparameters`` Hyperparameters that will be used for training.
153+
Will be made accessible as a ``dict[str, str]`` to the training code on
154+
SageMaker. For convenience, accepts other types besides strings, but
155+
``str`` will be called on keys and values to convert them before
156+
training.
157+
- ``train_volume_size`` Size in GB of the EBS volume to use for storing
158+
input data during training. Must be large enough to store training
159+
data if ``input_mode='File'`` is used (which is the default).
160+
- ``train_max_run`` Timeout in seconds for training, after which Amazon
161+
SageMaker terminates the job regardless of its current status.
162+
- ``input_mode`` The input mode that the algorithm supports. Valid
163+
modes: 'File' - Amazon SageMaker copies the training dataset from the
164+
S3 location to a directory in the Docker container. 'Pipe' - Amazon
165+
SageMaker streams data directly from S3 to the container via a Unix
166+
named pipe.
167+
- ``output_path`` S3 location where you want the training result (model
168+
artifacts and optional output files) saved. If not specified, results
169+
are stored to a default bucket. If the bucket with the specific name
170+
does not exist, the estimator creates the bucket during the ``fit``
171+
method execution.
172+
- ``output_kms_key`` Optional KMS key ID used to encrypt the training
173+
output.
174+
- ``job_name`` Name to assign for the training job that the ``fit``
175+
method launches. If not specified, the estimator generates a default
176+
job name, based on the training image name and current timestamp
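
The ``hyperparameters`` entry in the list above says that ``str`` is called on keys and values before training. The sketch below only illustrates that described behavior; ``stringify_hyperparameters`` is a hypothetical helper, not an SDK function, and the SDK's actual encoding may differ in detail.

```python
def stringify_hyperparameters(hyperparameters):
    """Mimic the conversion described above: str() on every key and value."""
    return {str(key): str(value) for key, value in hyperparameters.items()}


# Values as you might pass them to the estimator (hypothetical names).
submitted = {"learning_rate": 0.001, "num_episodes": 500, "use_gpu": True}
received = stringify_hyperparameters(submitted)

# The training code has to parse the strings back into the types it needs.
learning_rate = float(received["learning_rate"])
num_episodes = int(received["num_episodes"])
use_gpu = received["use_gpu"] == "True"
```

The practical consequence is that a training script should not assume it receives numbers or booleans; it should convert each value explicitly.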
177+
178+
Calling fit
179+
~~~~~~~~~~~
180+
181+
You start your training script by calling ``fit`` on an ``RLEstimator``. ``fit`` takes a few optional
182+
arguments.
183+
184+
Optional arguments
185+
''''''''''''''''''
186+
187+
- ``inputs``: This can take one of the following forms: A string
188+
S3 URI, for example ``s3://my-bucket/my-training-data``. In this
189+
case, the S3 objects rooted at the ``my-training-data`` prefix will
190+
be available in the default ``train`` channel. A dict from
191+
string channel names to S3 URIs. In this case, the objects rooted at
192+
each S3 prefix will be available as files in each channel directory.
193+
- ``wait``: Defaults to True, whether to block and wait for the
194+
training script to complete before returning.
195+
- ``logs``: Defaults to True, whether to show logs produced by training
196+
job in the Python session. Only meaningful when wait is True.
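
The two forms of ``inputs`` described above might be used as in the following sketch. The ``start_training`` wrapper, the bucket name, and the channel names are placeholders chosen for illustration; ``estimator`` is assumed to be an ``RLEstimator`` that has already been constructed.

```python
def start_training(estimator, single_channel=True):
    """Call ``fit`` with one of the two ``inputs`` forms described above.

    ``estimator`` is expected to be an ``RLEstimator``; the bucket and
    channel names below are placeholders.
    """
    if single_channel:
        # A single S3 URI: objects under the prefix land in the default channel.
        inputs = "s3://my-bucket/my-training-data"
    else:
        # A dict mapping channel names to S3 URIs: one directory per channel.
        inputs = {
            "train": "s3://my-bucket/my-training-data/train",
            "eval": "s3://my-bucket/my-training-data/eval",
        }
    # wait=False returns immediately instead of blocking until the job ends;
    # ``logs`` is left at its default because it only matters when wait is True.
    estimator.fit(inputs=inputs, wait=False)
    return inputs
```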
197+
198+
199+
Distributed RL Training
200+
-----------------------
201+
202+
Amazon SageMaker RL supports multi-core and multi-instance distributed training.
203+
Depending on your use case, training and/or environment rollout can be distributed.
204+
205+
Please see the `Amazon SageMaker examples <https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning>`_
206+
on how it can be done using different RL toolkits.
207+
208+
209+
Saving models
210+
-------------
211+
212+
In order to save your trained RL model for deployment on SageMaker, your training script should save your model
213+
to the filesystem path ``/opt/ml/model``. This value is also accessible through the environment variable
214+
``SM_MODEL_DIR``.
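
A minimal sketch of writing artifacts to that location follows. The ``save_artifacts`` helper, the ``model.json`` file name, and the JSON payload are hypothetical stand-ins for whatever your RL toolkit actually produces (checkpoints, exported graphs, and so on).

```python
import json
import os


def save_artifacts(artifacts, model_dir=None):
    """Write a JSON file of artifacts into the model directory.

    Falls back to SM_MODEL_DIR, then to /opt/ml/model, when no
    directory is given explicitly.
    """
    model_dir = model_dir or os.environ.get("SM_MODEL_DIR", "/opt/ml/model")
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, "model.json")
    with open(path, "w") as f:
        json.dump(artifacts, f)
    return path
```

Accepting an explicit ``model_dir`` makes it possible to try the script locally by pointing it at a temporary directory.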
215+
216+
Deploying RL Models
217+
-------------------
218+
219+
After an RL Estimator has been fit, you can host the newly created model in SageMaker.
220+
221+
After calling ``fit``, you can call ``deploy`` on an ``RLEstimator`` to create a SageMaker Endpoint.
222+
The Endpoint runs one of the SageMaker-provided model servers based on the ``framework`` parameter
223+
specified in the ``RLEstimator`` constructor and hosts the model produced by your training script,
224+
which was run when you called ``fit``. This was the model you saved to ``model_dir``.
225+
If ``image_name`` was specified, the provided image is used for the deployment instead.
226+
227+
``deploy`` returns a ``sagemaker.mxnet.MXNetPredictor`` for MXNet or
228+
``sagemaker.tensorflow.serving.Predictor`` for TensorFlow.
229+
230+
``predict`` returns the result of inference against your model.
231+
232+
.. code:: python
233+
234+
# Train my estimator
235+
rl_estimator = RLEstimator(entry_point='coach-train.py',
236+
toolkit=RLToolkit.COACH,
237+
toolkit_version='0.11.0',
238+
framework=RLFramework.MXNET,
239+
role='SageMakerRole',
240+
train_instance_type='ml.c4.2xlarge',
241+
train_instance_count=1)
242+
243+
rl_estimator.fit()
244+
245+
# Deploy my estimator to a SageMaker Endpoint and get a MXNetPredictor
246+
predictor = rl_estimator.deploy(instance_type='ml.m4.xlarge',
247+
initial_instance_count=1)
248+
249+
response = predictor.predict(data)
250+
251+
For more information please see `The SageMaker MXNet Model Server <https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/mxnet#the-sagemaker-mxnet-model-server>`_
252+
and `Deploying to TensorFlow Serving Endpoints <https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst>`_ documentation.
253+
254+
255+
Working with Existing Training Jobs
256+
-----------------------------------
257+
258+
Attaching to existing training jobs
259+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
260+
261+
You can attach an RL Estimator to an existing training job using the
262+
``attach`` method.
263+
264+
.. code:: python
265+
266+
my_training_job_name = 'MyAwesomeRLTrainingJob'
267+
rl_estimator = RLEstimator.attach(my_training_job_name)
268+
269+
After attaching, if the training job has finished with job status "Completed", it can be
270+
``deploy``\ ed to create a SageMaker Endpoint and return a ``Predictor``. If the training job is in progress,
271+
attach will block and display log messages from the training job, until the training job completes.
272+
273+
The ``attach`` method accepts the following arguments:
274+
275+
- ``training_job_name``: The name of the training job to attach
276+
to.
277+
- ``sagemaker_session``: The Session used
278+
to interact with SageMaker
279+
280+
RL Training Examples
281+
--------------------
282+
283+
Amazon provides several example Jupyter notebooks that demonstrate end-to-end training on Amazon SageMaker using RL.
284+
Please refer to:
285+
286+
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning
287+
288+
These are also available in SageMaker Notebook Instance hosted Jupyter notebooks under the sample notebooks folder.
289+
290+
291+
SageMaker RL Docker Containers
292+
------------------------------
293+
294+
When training and deploying training scripts, SageMaker runs your Python script in a Docker container with several
295+
libraries installed. When creating the Estimator and calling deploy to create the SageMaker Endpoint, you can control
296+
the environment your script runs in.
297+
298+
SageMaker runs RL Estimator scripts in either Python 3.5 for MXNet or Python 3.6 for TensorFlow.
299+
300+
The Docker images have the following dependencies installed:
301+
302+
+-------------------------+-------------------+-------------------+-------------------+
303+
| Dependencies | Coach 0.10.1 | Coach 0.11.0 | Ray 0.5.3 |
304+
+-------------------------+-------------------+-------------------+-------------------+
305+
| Python | 3.6 | 3.5(MXNet) or | 3.6 |
306+
| | | 3.6(TensorFlow) | |
307+
+-------------------------+-------------------+-------------------+-------------------+
308+
| CUDA (GPU image only) | 9.0 | 9.0 | 9.0 |
309+
+-------------------------+-------------------+-------------------+-------------------+
310+
| DL Framework | TensorFlow-1.11.0 | MXNet-1.3.0 or | TensorFlow-1.11.0 |
311+
| | | TensorFlow-1.11.0 | |
312+
+-------------------------+-------------------+-------------------+-------------------+
313+
| gym | 0.10.5 | 0.10.5 | 0.10.5 |
314+
+-------------------------+-------------------+-------------------+-------------------+
315+
316+
The Docker images extend Ubuntu 16.04.
317+
318+
You can select the version by passing a ``framework_version`` keyword argument to the RL Estimator constructor.
319+
Currently supported versions are listed in the above table. You can also set ``framework_version`` to only specify major and
320+
minor version, which will cause your training script to be run on the latest supported patch version of that minor
321+
version.
322+
323+
Alternatively, you can build your own image by following the instructions in the SageMaker RL containers
324+
repository, and passing ``image_name`` to the RL Estimator constructor.
325+
326+
You can visit `the SageMaker RL containers repository <https://github.com/aws/sagemaker-rl-container>`_.
