
Commit 70042b8

annaluo676 authored and yijiezh committed
Add README for RL directory; Typo fix in network compression README file (#940)
* Add README for RL directory; Typo fix in network compression README file
* Modify README for batch example
1 parent 3d155d8 commit 70042b8

3 files changed: +26 -8 lines changed


reinforcement_learning/README.md

Lines changed: 23 additions & 3 deletions
@@ -1,6 +1,26 @@
-# Common Reinforcement Learning Examples
+# Amazon SageMaker Examples
 
-These examples demonstrate how to train reinforcement learning models on SageMaker.
+### Common Reinforcement Learning Examples
 
-## FAQ
+These examples demonstrate how to train reinforcement learning models on SageMaker for a wide range of applications.
+
+- [Contextual Bandit with Live Environment](bandits_statlog_vw_customEnv) illustrates how you can manage your own contextual multi-armed bandit workflow on SageMaker, using the built-in [Vowpal Wabbit](https://github.com/VowpalWabbit/vowpal_wabbit) (VW) container to train and deploy contextual bandit models.
+- [Cartpole](rl_cartpole_coach) uses the SageMaker RL base [docker image](https://github.com/aws/sagemaker-rl-container) to balance a broom upright.
+- [Cartpole Batch](rl_cartpole_batch_coach) uses batch RL techniques to train Cartpole with offline data.
+- [Cartpole Spot Training](rl_managed_spot_cartpole_coach) uses SageMaker Managed Spot instances to train at a lower cost.
+- [DeepRacer](rl_deepracer_robomaker_coach_gazebo) gives a glimpse of the architecture used to get DeepRacer working with AWS RoboMaker.
+- [HVAC](rl_hvac_coach_energyplus) optimizes energy use based on the [EnergyPlus](https://energyplus.net/) simulator.
+- [Knapsack](rl_knapsack_coach_custom) is an example of using RL to address an operations research problem.
+- [Mountain Car](rl_mountain_car_coach_gymEnv) is a classic control RL problem, in which an under-powered car is tasked with climbing a steep mountain and is only successful when it reaches the top.
+- [Network Compression](rl_network_compression_ray_custom) reduces the size of a trained network using an RL algorithm.
+- [Object Tracker](rl_objecttracker_robomaker_coach_gazebo) trains a TurtleBot object tracker using Amazon SageMaker RL coupled with AWS RoboMaker.
+- [Portfolio Management](rl_portfolio_management_coach_customEnv) shows how to re-distribute capital across a set of financial assets using RL algorithms.
+- [Predictive Auto-scaling](rl_predictive_autoscaling_coach_customEnv) scales a production service via an RL approach, adding and removing resources in reaction to dynamically changing load.
+- [Resource Allocation](rl_resource_allocation_ray_customEnv) solves three canonical online and stochastic decision-making problems using RL algorithms.
+- [Roboschool Ray](rl_roboschool_ray) demonstrates how to use [Ray](https://rise.cs.berkeley.edu/projects/ray/) to scale RL training in different ways, and how to leverage SageMaker's Automatic Model Tuning functionality to optimize the training of an RL model.
+- [Roboschool Stable Baselines](rl_roboschool_stable_baselines) is an example of using [stable-baselines](https://stable-baselines.readthedocs.io/en/master/) to train RL algorithms.
+- [Tic-tac-toe](rl_tic_tac_toe_coach_customEnv) uses RL to train a policy, which then plays locally and interactively within the notebook.
+- [Traveling Salesman and Vehicle Routing](rl_traveling_salesman_vehicle_routing_coach) is an example of using RL to address operations research problems.
+
+### FAQ
 https://github.com/awslabs/amazon-sagemaker-examples#faq
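The Coach-based examples in the list above are all launched the same way through the SageMaker Python SDK. As a minimal, hypothetical sketch of that shared pattern (the entry-point script, IAM role, toolkit version, and instance type below are placeholders, not values taken from these notebooks):

```python
from sagemaker.rl import RLEstimator, RLToolkit, RLFramework

# All concrete values below are illustrative placeholders.
estimator = RLEstimator(
    entry_point="train-coach.py",      # hypothetical script-mode training script
    toolkit=RLToolkit.COACH,           # Intel Coach RL toolkit
    toolkit_version="0.11.0",          # placeholder toolkit version
    framework=RLFramework.TENSORFLOW,
    role="arn:aws:iam::111122223333:role/SageMakerRole",  # placeholder IAM role
    instance_type="ml.m4.xlarge",
    instance_count=1,
)
estimator.fit()  # launches the SageMaker RL training job
```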

reinforcement_learning/rl_cartpole_batch_coach/README.md

Lines changed: 1 addition & 3 deletions
@@ -1,8 +1,6 @@
 # Training Batch Reinforcement Learning Policies with Amazon SageMaker RL
 
-In many real-world problems, the reinforcement learning agent cannot interact with neither the real environment nor a simulated one. On one hand, creating a simulator that imitates the real environment dynamic could be quite complex and on the other, letting the learning agent attempt sub-optimal actions in the real world is quite risky. In such cases, the learning agent can only have access to batches of offline data that generated by some deployed policy. The learning agent need to utilize these data correctly to learn a better policy to solve the problem.
-
-This notebook shows an example of how to use batch reinforcement learning techniques to address such type of real-world problems: training a new policy from offline dataset when there is no way to interact with real environments or simulators. This example is a simple toy demonstrating how one might begin to address this real and challenging problem. We use gym `CartPole-v0` as a fake simulated system to generate offline dataset and the RL agents are trained using Amazon SageMaker RL.
+For many real-world problems, the reinforcement learning (RL) agent needs to learn from historical data that was generated by some deployed policy. For example, we may have historical data of experts playing games, users interacting with a website, or sensor data from a control system. This notebook shows an example of how to use batch RL to train a new policy from an offline dataset. We use gym `CartPole-v0` as a stand-in for a real system to generate the offline dataset, and the RL agents are trained using Amazon SageMaker RL.
 
 ## Contents
 
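As a rough illustration of the setup described in the new paragraph above, the following hypothetical sketch uses gym `CartPole-v0` with a random behavior policy to produce an offline transition dataset. It assumes the classic (pre-0.26) gym step API; the dataset format and file name are placeholders, not the notebook's actual pipeline:

```python
import gym
import numpy as np

# Hypothetical sketch: collect offline transitions from CartPole-v0
# using a uniformly random behavior policy (classic gym API assumed).
env = gym.make("CartPole-v0")
transitions = []  # (state, action, reward, next_state, done) tuples

for episode in range(100):
    state = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()  # random behavior policy
        next_state, reward, done, _ = env.step(action)
        transitions.append((state, action, reward, next_state, done))
        state = next_state

# Placeholder file name; a real pipeline would stage the data in S3
# so a SageMaker training job can read it.
np.save("cartpole_offline_dataset.npy", np.array(transitions, dtype=object))
```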

reinforcement_learning/rl_network_compression_ray_custom/README.md

Lines changed: 2 additions & 2 deletions
@@ -2,13 +2,13 @@
 
 ## What is network compression?
 
-Network compression is the process of reducing the size of a trained network, either by removing certain layers or by shrinking layers, while maintaining performance. This notebook implements the a version of network compression using reinforcement learning algorithm similar to the one proposed in [1].
+Network compression is the process of reducing the size of a trained network, either by removing certain layers or by shrinking layers, while maintaining performance. This notebook implements a version of network compression using a reinforcement learning algorithm similar to the one proposed in [1].
 
 [1] [Ashok, Anubhav, Nicholas Rhinehart, Fares Beainy, and Kris M. Kitani. "N2N learning: network to network compression via policy gradient reinforcement learning." arXiv preprint arXiv:1709.06030 (2017)](https://arxiv.org/abs/1709.06030).
 
 ## This Example
 
-In this example the network compression notebook uses a Sagemaker docker image containing Ray, tensorflow and OpenAI Gym. The network modification module is
+In this example, the network compression notebook uses a SageMaker docker image containing Ray, TensorFlow, and OpenAI Gym. The network modification module is
 treated as a simulation where the actions produced by the reinforcement learning algorithm (remove, shrink, etc.) can be run. The notebook defines a set of actions for each module. It
 demonstrates how one can use the SageMaker Python SDK `script` mode with a `Tensorflow+Ray+Gym` container. You can run
 `rl_network_compression_a3c_ray_tensorflow_NetworkCompressionEnv.ipynb` from a SageMaker notebook instance.
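To make the `script` mode usage mentioned in this README concrete, here is a minimal, hypothetical sketch of launching such a Ray + TensorFlow job with the SageMaker Python SDK's `RLEstimator`. The entry point, source directory, toolkit version, and instance type are assumptions for illustration, not values from this repository:

```python
import sagemaker
from sagemaker.rl import RLEstimator, RLToolkit, RLFramework

role = sagemaker.get_execution_role()  # assumes a SageMaker notebook instance

# Hypothetical sketch: script name, source_dir, and versions are placeholders.
estimator = RLEstimator(
    entry_point="train_network_compression.py",  # hypothetical Ray training script
    source_dir="src",                             # placeholder source directory
    toolkit=RLToolkit.RAY,
    toolkit_version="0.6.5",                      # placeholder Ray toolkit version
    framework=RLFramework.TENSORFLOW,
    role=role,
    instance_type="ml.p3.2xlarge",                # placeholder GPU instance choice
    instance_count=1,
)
estimator.fit()  # runs the script inside the Tensorflow+Ray+Gym container
```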
