Commit 67f60b7

Merge branch 'master' into job-step-subclass
2 parents bdb5cbe + 8d84618 commit 67f60b7

22 files changed: +2069, -62 lines changed

CHANGELOG.md

Lines changed: 15 additions & 1 deletion
@@ -1,5 +1,19 @@
 # Changelog

+## v2.87.0 (2022-04-20)
+
+### Features
+
+* Add Jumpstart example notebooks
+* add Tensorflow and Pytorch version for SM Training Compiler and expand to regular regions
+
+### Bug Fixes and Other Changes
+
+* integs for training compiler in non-PDX regions
+* TrainingStep cache misses due to timestamp based job name
+* retry context delete
+* Add more logging when unexpected number of artifacts found
+
 ## v2.86.2 (2022-04-14)

 ### Bug Fixes and Other Changes
@@ -165,7 +179,7 @@
 ### Features

 * override jumpstart content bucket
-* jumpstart model id suggestions
+* jumpstart model ID suggestions
 * adding customer metadata support to registermodel step

 ### Bug Fixes and Other Changes

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.86.3.dev0
+2.87.1.dev0

doc/api/training/distributed.rst

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@ The SageMaker Distributed Data Parallel Library
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 .. toctree::
-:maxdepth: 3
+:maxdepth: 2

 smd_data_parallel
 sdp_versions/latest
@@ -23,7 +23,7 @@ The SageMaker Distributed Model Parallel Library
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 .. toctree::
-:maxdepth: 3
+:maxdepth: 2

 smd_model_parallel
 smp_versions/latest

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

Lines changed: 40 additions & 6 deletions
@@ -5,9 +5,48 @@ Release Notes
 New features, bug fixes, and improvements are regularly made to the SageMaker
 distributed model parallel library.

-SageMaker Distributed Model Parallel 1.7.0 Release Notes
+SageMaker Distributed Model Parallel 1.8.0 Release Notes
 ========================================================

+*Date: March. 23. 2022*
+
+**New Features**
+
+* Added tensor parallelism support for the `GPT-J model
+<https://huggingface.co/docs/transformers/model_doc/gptj>`_.
+When using the GPT-J model of Hugging Face Transformers v4.17.0 with
+tensor parallelism, the SageMaker model parallel library automatically
+replaces the model with a tensor parallel distributed GPT-J model.
+For more information, see `Support for Hugging Face Transformer Models
+<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-hugging-face.html>`_
+in the *Amazon SageMaker Model Parallel Training developer guide*.
+
+**Migration to AWS Deep Learning Containers**
+
+This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers:
+
+* HuggingFace 4.17.0 DLC with PyTorch 1.10.2
+
+.. code::
+
+763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
+
+
+The binary file of this version of the library for custom container users:
+
+.. code::
+
+https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.10.0/build-artifacts/2022-03-12-00-33/smdistributed_modelparallel-1.8.0-cp38-cp38-linux_x86_64.whl
+
+
+----
+
+Release History
+===============
+
+SageMaker Distributed Model Parallel 1.7.0 Release Notes
+--------------------------------------------------------
+
 *Date: March. 07. 2022*

 **Currency Updates**
@@ -49,11 +88,6 @@ This version passed benchmark testing and is migrated to the following AWS Deep
 763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.10.2-gpu-py38-cu113-ubuntu20.04-sagemaker


-----
-
-Release History
-===============
-
 SageMaker Distributed Model Parallel 1.6.0 Release Notes
 --------------------------------------------------------


doc/api/training/smp_versions/latest.rst

Lines changed: 2 additions & 2 deletions
@@ -10,8 +10,8 @@ depending on which version of the library you need to use.
 To use the library, reference the
 **Common API** documentation alongside the framework specific API documentation.

-Version 1.7.0 (Latest)
-======================
+Version 1.7.0, 1.8.0 (Latest)
+=============================

 To use the library, reference the Common API documentation alongside the framework specific API documentation.

doc/doc_utils/jumpstart_doc_utils.py

Lines changed: 2 additions & 2 deletions
@@ -122,9 +122,9 @@ def create_jumpstart_model_table():
 file_content.append("==================================\n")
 file_content.append(
 """
-JumpStart for the SageMaker Python SDK uses model ids and model versions to access the necessary
+JumpStart for the SageMaker Python SDK uses model IDs and model versions to access the necessary
 utilities. This table serves to provide the core material plus some extra information that can be useful
-in selecting the correct model id and corresponding parameters.\n
+in selecting the correct model ID and corresponding parameters.\n
 """
 )
 file_content.append(

doc/overview.rst

Lines changed: 1 addition & 1 deletion
@@ -670,7 +670,7 @@ the ``model_id`` and ``model_version`` needed to retrieve the URI.
 model. To use the latest version, enter ``"*"``. This is a
 required parameter.

-To retrieve a model, first select a ``model id`` and ``version`` from
+To retrieve a model, first select a ``model ID`` and ``version`` from
 the :doc:`available models <./doc_utils/jumpstart>`.

 .. code:: python

src/sagemaker/jumpstart/accessors.py

Lines changed: 24 additions & 5 deletions
@@ -12,7 +12,7 @@
 # language governing permissions and limitations under the License.
 """This module contains accessors related to SageMaker JumpStart."""
 from __future__ import absolute_import
-from typing import Any, Dict, Optional
+from typing import Any, Dict, List, Optional
 from sagemaker.jumpstart.types import JumpStartModelHeader, JumpStartModelSpecs
 from sagemaker.jumpstart import cache
 from sagemaker.jumpstart.constants import JUMPSTART_DEFAULT_REGION_NAME
@@ -84,8 +84,8 @@ def get_model_header(region: str, model_id: str, version: str) -> JumpStartModel

 Args:
 region (str): region for which to retrieve header.
-model_id (str): model id to retrieve.
-version (str): semantic version to retrieve for the model id.
+model_id (str): model ID to retrieve.
+version (str): semantic version to retrieve for the model ID.
 """
 cache_kwargs = JumpStartModelsAccessor._validate_and_mutate_region_cache_kwargs(
 JumpStartModelsAccessor._cache_kwargs, region
@@ -101,8 +101,8 @@ def get_model_specs(region: str, model_id: str, version: str) -> JumpStartModelS

 Args:
 region (str): region for which to retrieve header.
-model_id (str): model id to retrieve.
-version (str): semantic version to retrieve for the model id.
+model_id (str): model ID to retrieve.
+version (str): semantic version to retrieve for the model ID.
 """
 cache_kwargs = JumpStartModelsAccessor._validate_and_mutate_region_cache_kwargs(
 JumpStartModelsAccessor._cache_kwargs, region
@@ -150,3 +150,22 @@ def reset_cache(cache_kwargs: Dict[str, Any] = None, region: Optional[str] = Non
 """
 cache_kwargs_dict = {} if cache_kwargs is None else cache_kwargs
 JumpStartModelsAccessor.set_cache_kwargs(cache_kwargs_dict, region)
+
+@staticmethod
+def get_manifest(
+cache_kwargs: Optional[Dict[str, Any]] = None, region: Optional[str] = None
+) -> List[JumpStartModelHeader]:
+"""Return entire JumpStart models manifest.
+
+Raises:
+ValueError: If region in `cache_kwargs` is inconsistent with `region` argument.
+
+Args:
+cache_kwargs (Dict[str, Any]): Optional. Cache kwargs to use.
+(Default: None).
+region (str): Optional. The region to use for the cache.
+(Default: None).
+"""
+cache_kwargs_dict: Dict[str, Any] = {} if cache_kwargs is None else cache_kwargs
+JumpStartModelsAccessor.set_cache_kwargs(cache_kwargs_dict, region)
+return JumpStartModelsAccessor._cache.get_manifest() # type: ignore
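
Note (illustrative, not part of the commit): one way a caller might consume the list returned by `get_manifest` is to suggest close model IDs for a mistyped one. The sketch below rests on several assumptions: `Header` is a hypothetical stand-in for `JumpStartModelHeader` carrying only `model_id` and `version`, `suggest_model_ids` is a made-up helper, the sample IDs are placeholders, and fuzzy matching through the standard library's `difflib` is chosen for illustration rather than taken from the SDK.

```python
from dataclasses import dataclass
from difflib import get_close_matches
from typing import List


@dataclass
class Header:
    """Hypothetical stand-in for JumpStartModelHeader (only the fields used here)."""

    model_id: str
    version: str


def suggest_model_ids(manifest: List[Header], query: str, n: int = 3) -> List[str]:
    """Return up to n model IDs from the manifest that closely resemble the query."""
    # A manifest can list the same model ID once per version, so deduplicate first.
    unique_ids = sorted({header.model_id for header in manifest})
    return get_close_matches(query, unique_ids, n=n, cutoff=0.6)


# Placeholder manifest; in practice the list would come from the accessor.
manifest = [
    Header("pytorch-ic-mobilenet-v2", "1.0.0"),
    Header("pytorch-ic-mobilenet-v2", "1.1.0"),
    Header("example-other-model", "1.0.0"),
]

suggestions = suggest_model_ids(manifest, "pytorch-ic-mobilenet-v3", n=1)
print(suggestions)  # ['pytorch-ic-mobilenet-v2']
```

Deduplicating before matching keeps a model with many versions from crowding out other candidates.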

src/sagemaker/jumpstart/cache.py

Lines changed: 9 additions & 9 deletions
@@ -146,16 +146,16 @@ def _get_manifest_key_from_model_id_semantic_version(
 key: JumpStartVersionedModelId,
 value: Optional[JumpStartVersionedModelId], # pylint: disable=W0613
 ) -> JumpStartVersionedModelId:
-"""Return model id and version in manifest that matches semantic version/id.
+"""Return model ID and version in manifest that matches semantic version/id.

 Uses ``packaging.version`` to perform version comparison. The highest model version
 matching the semantic version is used, which is compatible with the SageMaker
 version.

 Args:
-key (JumpStartVersionedModelId): Key for which to fetch versioned model id.
+key (JumpStartVersionedModelId): Key for which to fetch versioned model ID.
 value (Optional[JumpStartVersionedModelId]): Unused variable for current value of
-old cached model id/version.
+old cached model ID/version.

 Raises:
 KeyError: If the semantic version is not found in the manifest, or is found but
@@ -287,10 +287,10 @@ def get_manifest(self) -> List[JumpStartModelHeader]:
 return manifest

 def get_header(self, model_id: str, semantic_version_str: str) -> JumpStartModelHeader:
-"""Return header for a given JumpStart model id and semantic version.
+"""Return header for a given JumpStart model ID and semantic version.

 Args:
-model_id (str): model id for which to get a header.
+model_id (str): model ID for which to get a header.
 semantic_version_str (str): The semantic version for which to get a
 header.
 """
@@ -331,7 +331,7 @@ def _get_header_impl(
 Allows a single retry if the cache is old.

 Args:
-model_id (str): model id for which to get a header.
+model_id (str): model ID for which to get a header.
 semantic_version_str (str): The semantic version for which to get a
 header.
 attempt (int): attempt number at retrieving a header.
@@ -353,10 +353,10 @@ def _get_header_impl(
 return self._get_header_impl(model_id, semantic_version_str, attempt + 1)

 def get_specs(self, model_id: str, semantic_version_str: str) -> JumpStartModelSpecs:
-"""Return specs for a given JumpStart model id and semantic version.
+"""Return specs for a given JumpStart model ID and semantic version.

 Args:
-model_id (str): model id for which to get specs.
+model_id (str): model ID for which to get specs.
 semantic_version_str (str): The semantic version for which to get
 specs.
 """
@@ -369,6 +369,6 @@ def get_specs(self, model_id: str, semantic_version_str: str) -> JumpStartModelS
 return specs # type: ignore

 def clear(self) -> None:
-"""Clears the model id/version and s3 cache."""
+"""Clears the model ID/version and s3 cache."""
 self._s3_cache.clear()
 self._model_id_semantic_version_manifest_key_cache.clear()
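
Note (illustrative, not part of the commit): the docstring in the first hunk of this file says the highest model version matching the requested semantic version is used, and that a `KeyError` is raised when the version is not found. A minimal sketch of that selection rule follows, assuming plain dotted numeric versions and a simplified wildcard syntax (`*` or a trailing `.*`). The SDK itself relies on `packaging.version`, which this standard-library stand-in does not reproduce, and `resolve_version` is a hypothetical name.

```python
from typing import List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so comparison is numeric, not lexical."""
    return tuple(int(part) for part in version.split("."))


def resolve_version(available: List[str], spec: str) -> str:
    """Pick the highest available version matching spec (assumed syntax)."""
    if spec == "*":
        # Bare wildcard: every version is a candidate.
        candidates = list(available)
    elif spec.endswith(".*"):
        # Prefix wildcard such as '1.*': match on the leading components.
        prefix = parse(spec[:-2])
        candidates = [v for v in available if parse(v)[: len(prefix)] == prefix]
    else:
        # Exact version requested.
        candidates = [v for v in available if parse(v) == parse(spec)]
    if not candidates:
        raise KeyError(f"No available version matches '{spec}'")
    return max(candidates, key=parse)


versions = ["1.0.0", "1.2.0", "1.10.0", "2.0.1"]
print(resolve_version(versions, "*"))    # 2.0.1
print(resolve_version(versions, "1.*"))  # 1.10.0
```

Comparing parsed tuples rather than raw strings is what makes `1.10.0` rank above `1.2.0`.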

src/sagemaker/jumpstart/exceptions.py

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ def __init__(
 """Instantiates VulnerableJumpStartModelError exception.

 Args:
-model_id (Optional[str]): model id of vulnerable JumpStart model.
+model_id (Optional[str]): model ID of vulnerable JumpStart model.
 (Default: None).
 version (Optional[str]): version of vulnerable JumpStart model.
 (Default: None).
