* Make RLEstimator() PyTorch compatible & modify cartpole notebook
* set use_pytorch to False by default
* minor refactor; check in first unit test
* indent correction
reinforcement_learning/README.md
Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@ These examples demonstrate how to train reinforcement learning models on SageMak
 
 **IMPORTANT for rllib users:** Some examples may break with latest [rllib](https://docs.ray.io/en/latest/rllib.html) due to breaking API changes. Please refer to [Amazon SageMaker RL Container](https://github.com/aws/sagemaker-rl-container) for the latest public images and modify the configs in entrypoint scripts according to [rllib algorithm config](https://docs.ray.io/en/latest/rllib-algorithms.html).
 
+If you are using PyTorch rather than TensorFlow, please set `debugger_hook_config=False` when calling `RLEstimator()` to avoid TensorBoard conflicts.
+
 - [Contextual Bandit with Live Environment](bandits_statlog_vw_customEnv) illustrates how you can manage your own contextual multi-armed bandit workflow on SageMaker using the built-in [Vowpal Wabbit](https://github.com/VowpalWabbit/vowpal_wabbit) (VW) container to train and deploy contextual bandit models.
 - [Cartpole](rl_cartpole_coach) uses SageMaker RL base [docker image](https://github.com/aws/sagemaker-rl-container) to balance a broom upright.
 - [Cartpole Batch](rl_cartpole_batch_coach) uses batch RL techniques to train Cartpole with offline data.
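The README note added in this hunk could be applied along the following lines. This is only a sketch: `estimator_kwargs` is a hypothetical helper that is not part of this PR or of the SageMaker SDK, and the `use_pytorch` flag name is borrowed from the commit messages above. The only setting taken from the diff itself is `debugger_hook_config=False`.

```python
from typing import Any, Dict


def estimator_kwargs(use_pytorch: bool = False) -> Dict[str, Any]:
    """Hypothetical helper: extra keyword arguments to pass to RLEstimator()."""
    kwargs: Dict[str, Any] = {}
    if use_pytorch:
        # Per the README note: set debugger_hook_config=False for PyTorch
        # jobs to avoid TensorBoard conflicts.
        kwargs["debugger_hook_config"] = False
    return kwargs


# Usage (assumes the SageMaker Python SDK and AWS credentials; not run here):
#   from sagemaker.rl import RLEstimator
#   extra = estimator_kwargs(use_pytorch=True)
#   estimator = RLEstimator(<your existing arguments>, **extra)
print(estimator_kwargs(use_pytorch=True))
```

Keeping the framework-specific setting in one small function means the TensorFlow path stays untouched when `use_pytorch` is left at its default of `False`.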
reinforcement_learning/rl_cartpole_ray/rl_cartpole_ray_gymEnv.ipynb
Lines changed: 16 additions & 17 deletions
@@ -9,7 +9,7 @@
 "---\n",
 "## Introduction\n",
 "\n",
-"In this notebook we'll start from the cart-pole balancing problem, where a pole is attached by an un-actuated joint to a cart, moving along a frictionless track. Instead of applying control theory to solve the problem, this example shows how to solve the problem with reinforcement learning on Amazon SageMaker and Ray RLlib\n",
+"In this notebook we'll start from the cart-pole balancing problem, where a pole is attached by an un-actuated joint to a cart, moving along a frictionless track. Instead of applying control theory to solve the problem, this example shows how to solve the problem with reinforcement learning on Amazon SageMaker and Ray RLlib. You can choose either TensorFlow or PyTorch as your underlying DL framework.\n",
 "\n",
 "(For a similar example using Coach library, see this [link](../rl_cartpole_coach/rl_cartpole_coach_gymEnv.ipynb). Another Cart-pole example using Coach library and offline data can be found [here](../rl_cartpole_batch_coach/rl_cartpole_batch_coach.ipynb).)\n",
 "\n",
@@ -196,7 +196,8 @@
 "\n",
 "cpu_or_gpu = 'gpu' if instance_type.startswith('ml.p') else 'cpu'\n",
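The `cpu_or_gpu` line in this hunk picks the container flavor from the instance type prefix. A minimal sketch of that rule as a standalone function follows; the function name `cpu_or_gpu_for` is hypothetical and is used here only so the rule can be exercised outside the notebook.

```python
def cpu_or_gpu_for(instance_type: str) -> str:
    """Mirror the notebook's rule: 'ml.p*' instances are treated as GPU."""
    return "gpu" if instance_type.startswith("ml.p") else "cpu"


# Example instance type names; any other 'ml.*' name works the same way.
for name in ("ml.p3.2xlarge", "ml.m5.large"):
    print(name, cpu_or_gpu_for(name))
```

Note that a prefix check of this kind only recognizes the P family; an instance type such as `ml.g4dn.xlarge` would be classified as `cpu` by this rule, so it may need extending if other GPU families are used.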