Commit bd2d7f3

Author: Talia Chopra
documentation: adding note about CUDA 11 to SMP. Small title update PyTorch
1 parent 791bf0a commit bd2d7f3

3 files changed: 12 additions, 4 deletions

doc/api/training/smd_data_parallel.rst

Lines changed: 0 additions & 1 deletion
@@ -20,7 +20,6 @@ with multiple GPUs. As the cluster size increases, so does the significant drop
 in performance. This drop in performance is primarily caused the communications
 overhead between nodes in a cluster.

-
 .. rubric:: Customize your training script

 To customize your own training script, you will need the following:

doc/api/training/smd_model_parallel.rst

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,15 @@ across multiple GPUs with minimal code changes. The SMP API can be accessed thro

 Use the following sections to learn more about the model parallelism and the SMP library.

+.. important::
+   SMP only supports training jobs using CUDA 11. When you define a PyTorch or TensorFlow
+   ``Estimator`` with ``smdistributed`` ``enabled``,
+   it uses CUDA 11. When you extend or customize your own training image
+   you must use a CUDA 11 base image. See
+   `Extend or Adapt A Docker Container that Contains SMP
+   <https://integ-docs-aws.amazon.com/sagemaker/latest/dg/model-parallel-use-api.html#model-parallel-customize-container>`__
+   for more information.
+
 It is recommended to use this documentation alongside `SageMaker Distributed Model Parallel
 <http://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel.html>`__ in the Amazon SageMaker
 developer guide. This developer guide documentation includes:

doc/frameworks/pytorch/using_pytorch.rst

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
-###########################################
-Using PyTorch with the SageMaker Python SDK
-###########################################
+#########################################
+Use PyTorch with the SageMaker Python SDK
+#########################################

 With PyTorch Estimators and Models, you can train and host PyTorch models on Amazon SageMaker.
