Commit 746b943

fix index errors and conflicts
1 parent da3f450 commit 746b943

6 files changed: 75 additions, 69 deletions

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

Lines changed: 8 additions & 8 deletions
@@ -16,13 +16,13 @@ SageMaker Distributed Model Parallel 1.11.0 Release Notes
1616
The following new features are added for PyTorch.
1717

1818
* The library implements sharded data parallelism, which is a memory-saving
19-
distributed training technique that splits the training state of a model
20-
(model parameters, gradients, and optimizer states) across data parallel groups.
21-
With sharded data parallelism, you can reduce the per-GPU memory footprint of
22-
a model by sharding the training state over multiple GPUs. To learn more,
23-
see `Sharded Data Parallelism
24-
<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html>`_
25-
in the *Amazon SageMaker Developer Guide*.
19+
distributed training technique that splits the training state of a model
20+
(model parameters, gradients, and optimizer states) across data parallel groups.
21+
With sharded data parallelism, you can reduce the per-GPU memory footprint of
22+
a model by sharding the training state over multiple GPUs. To learn more,
23+
see `Sharded Data Parallelism
24+
<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html>`_
25+
in the *Amazon SageMaker Developer Guide*.
2626

2727
**Migration to AWS Deep Learning Containers**
2828

@@ -48,7 +48,7 @@ Binary file of this version of the library for `custom container
4848
Release History
4949
===============
5050

51-
SageMaker Distributed Model Parallel 1.11.0 Release Notes
51+
SageMaker Distributed Model Parallel 1.10.1 Release Notes
5252
---------------------------------------------------------
5353

5454
*Date: July. 19. 2022*
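
The sharded data parallelism entry in this change log says the training state (model parameters, gradients, and optimizer states) is split across GPUs to cut the per-GPU memory footprint. The arithmetic can be sketched roughly as follows. This is an illustrative estimate only: the function name, the 16 bytes per parameter figure (fp32 weights, fp32 gradients, and two fp32 Adam moments), and the even split are assumptions, and it ignores activations and any temporary buffers the library may allocate.

```python
def per_gpu_state_bytes(num_params, bytes_per_param, sharding_degree):
    """Estimate the training-state bytes each GPU holds (hypothetical helper).

    Assumes the state is divided evenly over `sharding_degree` GPUs.
    """
    if sharding_degree < 1:
        raise ValueError("sharding_degree must be at least 1")
    total = num_params * bytes_per_param
    # Ceiling division so a remainder still counts against one GPU.
    return -(-total // sharding_degree)


# Assumed example: 1 billion parameters, 16 bytes of state per parameter.
unsharded = per_gpu_state_bytes(1_000_000_000, 16, 1)
sharded = per_gpu_state_bytes(1_000_000_000, 16, 8)
print(unsharded // 10**9, "GB without sharding")
print(sharded // 10**9, "GB per GPU when sharded over 8 GPUs")
```

Under these assumptions the state shrinks in proportion to the sharding degree; real savings depend on the configuration described in the linked developer guide.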

doc/api/training/smp_versions/v1.10.0/smd_model_parallel_pytorch.rst

Lines changed: 4 additions & 1 deletion
@@ -24,7 +24,7 @@ smdistributed.modelparallel.torch.DistributedModel
2424
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2525

2626
.. class:: smdistributed.modelparallel.torch.DistributedModel
27-
:noindex:
27+
:noindex:
2828

2929
A sub-class of ``torch.nn.Module`` which specifies the model to be
3030
partitioned. Accepts a ``torch.nn.Module`` object ``module`` which is
@@ -493,13 +493,15 @@ smdistributed.modelparallel.torch.DistributedOptimizer
493493
This wrapper returns an ``optimizer`` object with the following methods overridden:
494494

495495
.. method:: state_dict( )
496+
:noindex:
496497

497498
Returns the ``state_dict`` that contains optimizer state for the entire model.
498499
It first collects the ``local_state_dict`` and gathers and merges
499500
the ``local_state_dict`` from all ``mp_rank``\ s to create a full
500501
``state_dict``.
501502

502503
.. method:: load_state_dict( )
504+
:noindex:
503505

504506
Same as the ``torch.optimizer.load_state_dict()`` , except:
505507
@@ -509,6 +511,7 @@ smdistributed.modelparallel.torch.DistributedOptimizer
509511
rank knows its local parameters.
510512

511513
.. method:: local_state_dict( )
514+
:noindex:
512515

513516
Returns the ``state_dict`` that contains the
514517
local optimizer state that belongs to the current \ ``mp_rank``. This
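
The ``state_dict`` description in this file says the full dictionary is built by gathering and merging the ``local_state_dict`` from every ``mp_rank``. A minimal sketch of the merge step is shown here in plain Python. The dictionary layout, the function name, and the assumption that each parameter's state lives on exactly one rank are illustrative, not the library's implementation, and the gather across processes is replaced by an ordinary list.

```python
def merge_local_state_dicts(local_state_dicts):
    """Merge per-rank optimizer states into one dict (hypothetical layout).

    `local_state_dicts[i]` stands in for the local_state_dict of mp_rank i.
    """
    full_state = {}
    for rank, local in enumerate(local_state_dicts):
        for param_id, state in local["state"].items():
            if param_id in full_state:
                # Assumption: a parameter's state is owned by a single rank.
                raise ValueError(
                    f"parameter {param_id} found again on rank {rank}"
                )
            full_state[param_id] = state
    return {"state": full_state}


# Two stand-in ranks, each holding state for different parameters.
rank0 = {"state": {0: {"step": 10}, 1: {"step": 10}}}
rank1 = {"state": {2: {"step": 10}}}
full = merge_local_state_dicts([rank0, rank1])
print(sorted(full["state"]))
```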

doc/api/training/smp_versions/v1.10.0/smd_model_parallel_pytorch_tensor_parallel.rst

Lines changed: 1 addition & 0 deletions
@@ -483,6 +483,7 @@ supported modules within that scope. To do this, you can use the
483483
following API:
484484

485485
.. decorator:: smdistributed.modelparallel.torch.tensor_parallelism(enabled=True, **kwargs)
486+
:noindex:
486487

487488
- A context manager that enables or disables tensor parallelism for
488489
any supported module that is created inside. If there are nested

doc/api/training/smp_versions/v1.10.0/smd_model_parallel_tensorflow.rst

Lines changed: 0 additions & 1 deletion
@@ -17,7 +17,6 @@ you need to add the following import statement at the top of your training scrip
1717

1818
.. class:: smp.DistributedModel
1919
:noindex:
20-
:noindex:
2120

2221
A sub-class of the Keras \ ``Model`` class, which defines the model to
2322
be partitioned. Model definition is done by sub-classing
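
The hunk above drops a second ``:noindex:`` option from one directive; docutils generally rejects a directive that repeats an option. A small checker for this pattern might look like the sketch below. The function name and the regular expression are assumptions made for illustration, and the check is deliberately naive: it treats any indented ``:name:`` line as an option, so it would also flag repeated field names in a field list.

```python
import re

# Matches an indented line that starts with ":name:" (assumed option syntax).
OPTION_RE = re.compile(r"^\s+:([A-Za-z][\w-]*):")


def find_duplicate_options(rst_text):
    """Return (line_number, option_name) for options repeated in one run."""
    duplicates = []
    seen = set()
    for number, line in enumerate(rst_text.splitlines(), start=1):
        match = OPTION_RE.match(line)
        if match is None:
            # Any non-option line ends the current run of options.
            seen = set()
            continue
        name = match.group(1)
        if name in seen:
            duplicates.append((number, name))
        seen.add(name)
    return duplicates


SAMPLE = (
    ".. class:: smp.DistributedModel\n"
    "   :noindex:\n"
    "   :noindex:\n"
    "\n"
    "   A sub-class of the Keras Model class.\n"
)
print(find_duplicate_options(SAMPLE))
```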

doc/conf.py

Lines changed: 2 additions & 1 deletion
@@ -96,8 +96,9 @@
9696
# autosectionlabel
9797
autosectionlabel_prefix_document = True
9898

99-
99+
'''
100100
def setup(app):
101101
sys.stdout.write("Generating JumpStart model table...")
102102
sys.stdout.flush()
103103
create_jumpstart_model_table()
104+
'''
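
The two added ``'''`` lines wrap ``setup(app)`` in a string literal, so Python no longer defines the function and Sphinx has no ``setup`` hook to call in ``conf.py``. The effect can be checked with the standard ``ast`` module, as in this sketch. The source strings are stand-ins written for the example, not the real ``conf.py``.

```python
import ast

# Stand-in for the original, active definition.
ACTIVE = "def setup(app):\n    create_jumpstart_model_table()\n"
# The same code wrapped in triple quotes, as the diff does.
DISABLED = "'''\n" + ACTIVE + "'''\n"


def defined_functions(source):
    """Return the names of top-level functions defined in `source`."""
    tree = ast.parse(source)
    return [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]


print(defined_functions(ACTIVE))
print(defined_functions(DISABLED))
```

Wrapping code in a string is a quick way to disable it, but commenting the lines out or deleting them makes the intent clearer to later readers.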

doc/doc_utils/pretrainedmodels.rst

Lines changed: 60 additions & 58 deletions
@@ -1,3 +1,5 @@
1+
.. _all-pretrained-models:
2+
13
.. |external-link| raw:: html
24

35
<i class="fa fa-external-link"></i>
@@ -384,176 +386,176 @@ Built-in Algorithms with pre-trained Model Table
384386
- `LightGBM <https://lightgbm.readthedocs.io/en/latest/>`__ |external-link|
385387
* - mxnet-is-mask-rcnn-fpn-resnet101-v1d-coco
386388
- False
387-
- 1.1.0
388-
- 2.75.0
389+
- 1.2.0
390+
- 2.100.0
389391
- Instance Segmentation
390392
- `GluonCV <https://cv.gluon.ai/model_zoo/segmentation.html>`__ |external-link|
391393
* - mxnet-is-mask-rcnn-fpn-resnet18-v1b-coco
392394
- False
393-
- 1.1.0
394-
- 2.75.0
395+
- 1.2.0
396+
- 2.100.0
395397
- Instance Segmentation
396398
- `GluonCV <https://cv.gluon.ai/model_zoo/segmentation.html>`__ |external-link|
397399
* - mxnet-is-mask-rcnn-fpn-resnet50-v1b-coco
398400
- False
399-
- 1.1.0
400-
- 2.75.0
401+
- 1.2.0
402+
- 2.100.0
401403
- Instance Segmentation
402404
- `GluonCV <https://cv.gluon.ai/model_zoo/segmentation.html>`__ |external-link|
403405
* - mxnet-is-mask-rcnn-resnet18-v1b-coco
404406
- False
405-
- 1.1.0
406-
- 2.75.0
407+
- 1.2.0
408+
- 2.100.0
407409
- Instance Segmentation
408410
- `GluonCV <https://cv.gluon.ai/model_zoo/segmentation.html>`__ |external-link|
409411
* - mxnet-od-faster-rcnn-fpn-resnet101-v1d-coco
410412
- False
411-
- 1.1.0
412-
- 2.75.0
413+
- 1.2.0
414+
- 2.100.0
413415
- Object Detection
414416
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
415417
* - mxnet-od-faster-rcnn-fpn-resnet50-v1b-coco
416418
- False
417-
- 1.1.0
418-
- 2.75.0
419+
- 1.2.0
420+
- 2.100.0
419421
- Object Detection
420422
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
421423
* - mxnet-od-faster-rcnn-resnet101-v1d-coco
422424
- False
423-
- 1.1.0
424-
- 2.75.0
425+
- 1.2.0
426+
- 2.100.0
425427
- Object Detection
426428
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
427429
* - mxnet-od-faster-rcnn-resnet50-v1b-coco
428430
- False
429-
- 1.1.0
430-
- 2.75.0
431+
- 1.2.0
432+
- 2.100.0
431433
- Object Detection
432434
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
433435
* - mxnet-od-faster-rcnn-resnet50-v1b-voc
434436
- False
435-
- 1.1.0
436-
- 2.75.0
437+
- 1.2.0
438+
- 2.100.0
437439
- Object Detection
438440
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
439441
* - mxnet-od-ssd-300-vgg16-atrous-coco
440442
- True
441-
- 1.2.3
442-
- 2.75.0
443+
- 1.3.0
444+
- 2.100.0
443445
- Object Detection
444446
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
445447
* - mxnet-od-ssd-300-vgg16-atrous-voc
446448
- True
447-
- 1.2.3
448-
- 2.75.0
449+
- 1.3.0
450+
- 2.100.0
449451
- Object Detection
450452
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
451453
* - mxnet-od-ssd-512-mobilenet1-0-coco
452454
- True
453-
- 1.2.3
454-
- 2.75.0
455+
- 1.3.0
456+
- 2.100.0
455457
- Object Detection
456458
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
457459
* - mxnet-od-ssd-512-mobilenet1-0-voc
458460
- True
459-
- 1.2.3
460-
- 2.75.0
461+
- 1.3.0
462+
- 2.100.0
461463
- Object Detection
462464
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
463465
* - mxnet-od-ssd-512-resnet50-v1-coco
464466
- True
465-
- 1.2.3
466-
- 2.75.0
467+
- 1.3.0
468+
- 2.100.0
467469
- Object Detection
468470
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
469471
* - mxnet-od-ssd-512-resnet50-v1-voc
470472
- True
471-
- 1.2.3
472-
- 2.75.0
473+
- 1.3.0
474+
- 2.100.0
473475
- Object Detection
474476
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
475477
* - mxnet-od-ssd-512-vgg16-atrous-coco
476478
- True
477-
- 1.2.3
478-
- 2.75.0
479+
- 1.3.0
480+
- 2.100.0
479481
- Object Detection
480482
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
481483
* - mxnet-od-ssd-512-vgg16-atrous-voc
482484
- True
483-
- 1.2.3
484-
- 2.75.0
485+
- 1.3.0
486+
- 2.100.0
485487
- Object Detection
486488
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
487489
* - mxnet-od-yolo3-darknet53-coco
488490
- False
489-
- 1.1.0
490-
- 2.75.0
491+
- 1.2.0
492+
- 2.100.0
491493
- Object Detection
492494
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
493495
* - mxnet-od-yolo3-darknet53-voc
494496
- False
495-
- 1.1.0
496-
- 2.75.0
497+
- 1.2.0
498+
- 2.100.0
497499
- Object Detection
498500
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
499501
* - mxnet-od-yolo3-mobilenet1-0-coco
500502
- False
501-
- 1.1.0
502-
- 2.75.0
503+
- 1.2.0
504+
- 2.100.0
503505
- Object Detection
504506
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
505507
* - mxnet-od-yolo3-mobilenet1-0-voc
506508
- False
507-
- 1.1.0
508-
- 2.75.0
509+
- 1.2.0
510+
- 2.100.0
509511
- Object Detection
510512
- `GluonCV <https://cv.gluon.ai/model_zoo/detection.html>`__ |external-link|
511513
* - mxnet-semseg-fcn-resnet101-ade
512514
- True
513-
- 1.3.5
514-
- 2.75.0
515+
- 1.4.0
516+
- 2.100.0
515517
- Semantic Segmentation
516518
- `GluonCV <https://cv.gluon.ai/model_zoo/segmentation.html>`__ |external-link|
517519
* - mxnet-semseg-fcn-resnet101-coco
518520
- True
519-
- 1.3.5
520-
- 2.75.0
521+
- 1.4.0
522+
- 2.100.0
521523
- Semantic Segmentation
522524
- `GluonCV <https://cv.gluon.ai/model_zoo/segmentation.html>`__ |external-link|
523525
* - mxnet-semseg-fcn-resnet101-voc
524526
- True
525-
- 1.3.5
526-
- 2.75.0
527+
- 1.4.0
528+
- 2.100.0
527529
- Semantic Segmentation
528530
- `GluonCV <https://cv.gluon.ai/model_zoo/segmentation.html>`__ |external-link|
529531
* - mxnet-semseg-fcn-resnet50-ade
530532
- True
531-
- 1.3.5
532-
- 2.75.0
533+
- 1.4.0
534+
- 2.100.0
533535
- Semantic Segmentation
534536
- `GluonCV <https://cv.gluon.ai/model_zoo/segmentation.html>`__ |external-link|
535537
* - mxnet-tcembedding-robertafin-base-uncased
536538
- False
537-
- 1.1.0
538-
- 2.75.0
539+
- 1.2.0
540+
- 2.100.0
539541
- Text Embedding
540542
- `GluonCV <https://nlp.gluon.ai/master/_modules/gluonnlp/models/roberta.html>`__ |external-link|
541543
* - mxnet-tcembedding-robertafin-base-wiki-uncased
542544
- False
543-
- 1.1.0
544-
- 2.75.0
545+
- 1.2.0
546+
- 2.100.0
545547
- Text Embedding
546548
- `GluonCV <https://nlp.gluon.ai/master/_modules/gluonnlp/models/roberta.html>`__ |external-link|
547549
* - mxnet-tcembedding-robertafin-large-uncased
548550
- False
549-
- 1.1.0
550-
- 2.75.0
551+
- 1.2.0
552+
- 2.100.0
551553
- Text Embedding
552554
- `GluonCV <https://nlp.gluon.ai/master/_modules/gluonnlp/models/roberta.html>`__ |external-link|
553555
* - mxnet-tcembedding-robertafin-large-wiki-uncased
554556
- False
555-
- 1.1.0
556-
- 2.75.0
557+
- 1.2.0
558+
- 2.100.0
557559
- Text Embedding
558560
- `GluonCV <https://nlp.gluon.ai/master/_modules/gluonnlp/models/roberta.html>`__ |external-link|
559561
* - pytorch-eqa-bert-base-cased
