Docs (aws#96)

jarednielsen · rahul003 · commit 4cfc85e20719 · 2019-12-05T23:46:35.000-08:00
* Fix yet another flaky test that doesn't do what it should

* beef up pt and mxnet
diff --git a/docs/mxnet.md b/docs/mxnet.md
@@ -1,12 +1,44 @@
 # MXNet
 
-SageMaker Zero-Code-Change supported container: MXNet 1.6. See [AWS Docs](https://docs.aws.amazon.com/sagemaker/latest/dg/train-model.html) for more information.\
-Python API supported versions: MXNet 1.4, 1.5, 1.6.
-
 ## Contents
+- [Support](#support)
+- [How to Use](#how-to-use)
 - [Example](#mxnet-example)
 - [Full API](#full-api)
 
+---
+
+## Support
+
+### Versions
+- The Zero Script Change experience, which requires no modifications to your training script, is supported in the official [SageMaker Framework Container for MXNet 1.6](https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html) and the [AWS Deep Learning Container for MXNet 1.6](https://aws.amazon.com/machine-learning/containers/).
+
+- When you use the Python API, which requires a few minimal changes to your training script, this library supports MXNet 1.4, 1.5, and 1.6.
+
+---
+
+## How to Use
+### Using Zero Script Change containers
+In this case, you don't need to do anything to get the hook running. We encourage you to configure the hook from the SageMaker Python SDK so that you can run different jobs with different configurations without modifying your script. If you need access to the hook to configure something that cannot be set through the SageMaker SDK, you can retrieve it as follows.
+```python
+import smdebug.mxnet as smd
+hook = smd.Hook.create_from_json_file()
+```
+Note that even in these containers you can still create the hook through smdebug's Python API, as shown in the next section.
+
+### Bring your own container experience
+#### 1. Create a hook
+If using SageMaker, configure the hook in the SageMaker Python SDK using the Estimator class, then instantiate it with
+`smd.Hook.create_from_json_file()`. Otherwise, call the hook class constructor, `smd.Hook()`.
+
+#### 2. Register the model to the hook
+Call `hook.register_block(net)`.
+
+#### 3. (Optional) Configure Collections, SaveConfig and ReductionConfig
+See the [Common API](api.md) page for details on how to do this.
+
+---
+
 ## MXNet Example
 ```python
 import smdebug.mxnet as smd
diff --git a/docs/pytorch.md b/docs/pytorch.md
@@ -1,13 +1,47 @@
 # PyTorch
 
-SageMaker Zero-Code-Change supported containers: PyTorch 1.3. See [AWS Docs](https://docs.aws.amazon.com/sagemaker/latest/dg/train-model.html) for more information.\
-Python API supported versions: 1.2, 1.3.
-
 ## Contents
+- [Support](#support)
+- [How to Use](#how-to-use)
 - [Module Loss Example](#module-loss-example)
 - [Functional Loss Example](#functional-loss-example)
 - [Full API](#full-api)
 
+## Support
+
+### Versions
+- The Zero Script Change experience, which requires no modifications to your training script, is supported in the official [SageMaker Framework Container for PyTorch 1.3](https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html) and the [AWS Deep Learning Container for PyTorch 1.3](https://aws.amazon.com/machine-learning/containers/).
+
+- When you use the Python API, which requires a few minimal changes to your training script, this library supports PyTorch 1.2 and 1.3.
+
+---
+
+## How to Use
+### Using Zero Script Change containers
+In this case, you don't need to do anything to get the hook running. We encourage you to configure the hook from the SageMaker Python SDK so that you can run different jobs with different configurations without modifying your script. If you need access to the hook to configure something that cannot be set through the SageMaker SDK, you can retrieve it as follows.
+```python
+import smdebug.pytorch as smd
+hook = smd.Hook.create_from_json_file()
+```
+Note that even in these containers you can still create the hook through smdebug's Python API, as shown in the next section.
+
+### Bring your own container experience
+#### 1. Create a hook
+If using SageMaker, configure the hook in the SageMaker Python SDK using the Estimator class, then instantiate it with
+`smd.Hook.create_from_json_file()`. Otherwise, call the hook class constructor, `smd.Hook()`.
+
+#### 2. Register the model to the hook
+Call `hook.register_module(net)`.
+
+#### 3. Register your loss function to the hook
+If your loss is a subclass of `nn.Module`, call `hook.register_loss(loss_criterion)` once before training starts.\
+If your loss is a function from `nn.functional`, call `hook.record_tensor_value(loss)` after each training step.
+
+#### 4. (Optional) Configure Collections, SaveConfig and ReductionConfig
+See the [Common API](api.md) page for details on how to do this.
+
+---
+
 ## Module Loss Example
 ```python
 import smdebug.pytorch as smd
@@ -38,6 +72,8 @@ for (inputs, labels) in trainloader:
     optimizer.step()
 ```
 
+---
+
 ## Functional Loss Example
 ```python
 import smdebug.pytorch as smd
@@ -70,6 +106,8 @@ for (inputs, labels) in trainloader:
     optimizer.step()
 ```
 
+---
+
 ## Full API
 See the [Common API](api.md) page for details about Collection, SaveConfig, and ReductionConfig.\
 See the [Analysis](analysis.md) page for details about analyzing a training job.
diff --git a/tests/mxnet/test_training_end.py b/tests/mxnet/test_training_end.py
@@ -34,9 +34,9 @@ def test_end_local_training():
 @pytest.mark.slow  # 0:04 to run
 def test_end_s3_training():
     run_id = str(uuid.uuid4())
-    bucket = "smdebugcodebuildtest"
-    key = "newlogsRunTest/" + run_id
-    out_dir = bucket + "/" + key
+    bucket = "smdebug-testing"
+    key = f"outputs/{uuid.uuid4()}"
+    out_dir = "s3://" + bucket + "/" + key
     assert has_training_ended(out_dir) == False
     subprocess.check_call(
         [