Commit 897365f

Merge branch 'master' into fix/jumpstart-amt-tracking
2 parents a26590b + 617bfab commit 897365f

7 files changed: 126 additions & 21 deletions

doc/api/training/sdp_versions/latest.rst

Lines changed: 2 additions & 2 deletions
@@ -26,8 +26,8 @@ depending on the version of the library you use.
 <https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel-use-api.html#data-parallel-use-python-skd-api>`_
 for more information.
 
-Version 1.4.0 (Latest)
-======================
+Version 1.4.0, 1.4.1 (Latest)
+=============================
 
 .. toctree::
    :maxdepth: 1

doc/api/training/sdp_versions/v1.2.x/smd_data_parallel_pytorch.rst

Lines changed: 1 addition & 1 deletion
@@ -266,7 +266,7 @@ PyTorch API
 .. note::
 
    The ``no_sync()`` context manager is available from smdistributed-dataparallel v1.2.2.
-   To find the release note, see :ref:`sdp_1.2.2_release_note`.
+   To find the release note, see :ref:`sdp_release_note`.
 
 **Example:**
 

doc/api/training/smd_data_parallel_release_notes/smd_data_parallel_change_log.rst

Lines changed: 38 additions & 7 deletions
@@ -1,4 +1,4 @@
-.. _sdp_1.2.2_release_note:
+.. _sdp_release_note:
 
 #############
 Release Notes
@@ -7,9 +7,45 @@ Release Notes
 New features, bug fixes, and improvements are regularly made to the SageMaker
 distributed data parallel library.
 
-SageMaker Distributed Data Parallel 1.4.0 Release Notes
+SageMaker Distributed Data Parallel 1.4.1 Release Notes
 =======================================================
 
+*Date: May. 3. 2022*
+
+**Currency Updates**
+
+* Added support for PyTorch 1.11.0
+
+**Known Issues**
+
+* The library currently does not support the PyTorch sub-process groups API (torch.distributed.new_group (https://pytorch.org/docs/stable/distributed.html#torch.distributed.new_group)).
+
+
+**Migration to AWS Deep Learning Containers**
+
+This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
+
+- PyTorch 1.11.0 DLC
+
+  .. code::
+
+    763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.11.0-gpu-py38-cu113-ubuntu20.04-sagemaker
+
+Binary file of this version of the library for custom container users:
+
+  .. code::
+
+    https://smdataparallel.s3.amazonaws.com/binary/pytorch/1.11.0/cu113/2022-04-14/smdistributed_dataparallel-1.4.1-cp38-cp38-linux_x86_64.whl
+
+
+----
+
+Release History
+===============
+
+SageMaker Distributed Data Parallel 1.4.0 Release Notes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 *Date: Feb. 24. 2022*
 
 **New Features**
@@ -72,11 +108,6 @@ This version passed benchmark testing and is migrated to the following AWS Deep
     763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.10.2-gpu-py38-cu113-ubuntu20.04-sagemaker
 
 
-----
-
-Release History
-===============
-
 SageMaker Distributed Data Parallel 1.2.2 Release Notes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
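
Review note: the image URIs in these release notes carry a ``<region>`` placeholder that the reader must fill in. A minimal sketch of that substitution follows. The helper name ``dlc_image_uri`` is hypothetical, and it assumes that the account ID shown in the release note applies to the chosen region, which may not hold everywhere.

```python
# Template copied from the 1.4.1 release note; only the region is substituted.
DLC_IMAGE_TEMPLATE = (
    "763104351884.dkr.ecr.{region}.amazonaws.com/"
    "pytorch-training:1.11.0-gpu-py38-cu113-ubuntu20.04-sagemaker"
)


def dlc_image_uri(region):
    """Hypothetical helper: fill an AWS region into the DLC image URI template."""
    if not region:
        raise ValueError("region must be a non-empty string")
    return DLC_IMAGE_TEMPLATE.format(region=region)


if __name__ == "__main__":
    print(dlc_image_uri("us-west-2"))
```

In the SageMaker Python SDK itself, image URIs are normally resolved through the ``sagemaker.image_uris`` module rather than by string formatting; the sketch only illustrates the placeholder.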

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

Lines changed: 34 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -5,9 +5,41 @@ Release Notes
55
New features, bug fixes, and improvements are regularly made to the SageMaker
66
distributed model parallel library.
77

8-
SageMaker Distributed Model Parallel 1.8.1 Release Notes
8+
SageMaker Distributed Model Parallel 1.9.0 Release Notes
99
========================================================
1010

11+
*Date: May. 3. 2022*
12+
13+
**Currency Updates**
14+
15+
* Added support for PyTorch 1.11.0
16+
17+
**Migration to AWS Deep Learning Containers**
18+
19+
This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
20+
21+
- PyTorch 1.11.0 DLC
22+
23+
.. code::
24+
25+
763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.11.0-gpu-py38-cu113-ubuntu20.04-sagemaker
26+
27+
Binary file of this version of the library for custom container users:
28+
29+
.. code::
30+
31+
https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.11.0/build-artifacts/2022-04-20-17-05/smdistributed_modelparallel-1.9.0-cp38-cp38-linux_x86_64.whl
32+
33+
34+
35+
----
36+
37+
Release History
38+
===============
39+
40+
SageMaker Distributed Model Parallel 1.8.1 Release Notes
41+
--------------------------------------------------------
42+
1143
*Date: April. 23. 2022*
1244

1345
**New Features**
@@ -59,11 +91,6 @@ This version passed benchmark testing and is migrated to the following AWS Deep
     https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.10.0/build-artifacts/2022-04-14-03-58/smdistributed_modelparallel-1.8.1-cp38-cp38-linux_x86_64.whl
 
 
-----
-
-Release History
-===============
-
 SageMaker Distributed Model Parallel 1.8.0 Release Notes
 --------------------------------------------------------
 
@@ -91,7 +118,7 @@ This version passed benchmark testing and is migrated to the following AWS Deep
     763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
 
 
-* The binary file of this version of the library for custom container users
+The binary file of this version of the library for custom container users:
 
   .. code::
 

doc/api/training/smp_versions/latest.rst

Lines changed: 2 additions & 2 deletions
@@ -10,8 +10,8 @@ depending on which version of the library you need to use.
 To use the library, reference the
 **Common API** documentation alongside the framework specific API documentation.
 
-Version 1.7.0, 1.8.0, 1.8.1 (Latest)
-====================================
+Version 1.7.0, 1.8.0, 1.8.1, 1.9.0 (Latest)
+===========================================
 
 To use the library, reference the Common API documentation alongside the framework specific API documentation.
 

src/sagemaker/fw_utils.py

Lines changed: 6 additions & 2 deletions
@@ -16,6 +16,7 @@
 import logging
 import os
 import re
+import time
 import shutil
 import tempfile
 from collections import namedtuple
@@ -24,6 +25,7 @@
 import sagemaker.image_uris
 from sagemaker.session_settings import SessionSettings
 import sagemaker.utils
+from sagemaker.workflow import is_pipeline_variable
 
 from sagemaker.deprecations import renamed_warning
 
@@ -395,8 +397,10 @@ def model_code_key_prefix(code_location_key_prefix, model_name, image):
     Returns:
         str: the key prefix to be used in uploading code
     """
-    training_job_name = sagemaker.utils.name_from_image(image)
-    return "/".join(filter(None, [code_location_key_prefix, model_name or training_job_name]))
+    name_from_image = f"/model_code/{int(time.time())}"
+    if not is_pipeline_variable(image):
+        name_from_image = sagemaker.utils.name_from_image(image)
+    return "/".join(filter(None, [code_location_key_prefix, model_name or name_from_image]))
 
 
 def warn_if_parameter_server_with_multi_gpu(training_instance_type, distribution):
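
Review note: the change to ``model_code_key_prefix`` can be exercised without the SDK. The sketch below copies the new control flow and replaces the two SDK calls with stand-ins. ``PipelineVariableStub``, the local ``is_pipeline_variable``, and the simplified ``name_from_image`` are assumptions made for illustration only, not the SDK's implementations.

```python
import re
import time


class PipelineVariableStub:
    """Hypothetical stand-in for a value resolved only at pipeline run time."""


def is_pipeline_variable(value):
    # Stand-in for sagemaker.workflow.is_pipeline_variable (assumed to be a type check).
    return isinstance(value, PipelineVariableStub)


def name_from_image(image):
    # Simplified stand-in for sagemaker.utils.name_from_image:
    # take the repository name from the URI and append a timestamp.
    match = re.match(r"^(.+/)?([^:/]+)(:[^:]+)?$", image)
    base = match.group(2) if match else image
    return f"{base}-{int(time.time())}"


def model_code_key_prefix(code_location_key_prefix, model_name, image):
    # Default used when the image URI cannot be parsed at definition time.
    fallback_name = f"/model_code/{int(time.time())}"
    if not is_pipeline_variable(image):
        fallback_name = name_from_image(image)
    # filter(None, ...) drops an empty or missing prefix before joining.
    return "/".join(filter(None, [code_location_key_prefix, model_name or fallback_name]))


if __name__ == "__main__":
    uri = (
        "763104351884.dkr.ecr.us-west-2.amazonaws.com/"
        "pytorch-training:1.11.0-gpu-py38-cu113-ubuntu20.04-sagemaker"
    )
    print(model_code_key_prefix("code", None, uri))
    print(model_code_key_prefix("code", None, PipelineVariableStub()))
    print(model_code_key_prefix("code", "my-model", PipelineVariableStub()))
```

One thing the sketch makes visible: the fallback string starts with a slash, so joining it after a non-empty prefix produces two consecutive slashes (``code//model_code/...``). Whether that is intended is not stated in the commit.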

tests/unit/sagemaker/workflow/test_model_step.py

Lines changed: 43 additions & 0 deletions
@@ -46,6 +46,8 @@
     SageMakerJobStepRetryPolicy,
 )
 from sagemaker.xgboost import XGBoostModel
+from sagemaker.lambda_helper import Lambda
+from sagemaker.workflow.lambda_step import LambdaStep, LambdaOutput, LambdaOutputTypeEnum
 from tests.unit import DATA_DIR
 from tests.unit.sagemaker.workflow.helpers import CustomStep
 
@@ -844,3 +846,44 @@ def _verify_register_model_container_definition(
     if submit_dir and not submit_dir.startswith("s3://"):
         # exclude the s3 path assertion as it contains timestamp
         assert submit_dir == expected_submit_dir
+
+
+def test_model_step_with_lambda_property_reference(pipeline_session):
+    lambda_step = LambdaStep(
+        name="MyLambda",
+        lambda_func=Lambda(
+            function_arn="arn:aws:lambda:us-west-2:123456789012:function:sagemaker_test_lambda"
+        ),
+        outputs=[
+            LambdaOutput(output_name="model_image", output_type=LambdaOutputTypeEnum.String),
+            LambdaOutput(output_name="model_artifact", output_type=LambdaOutputTypeEnum.String),
+        ],
+    )
+
+    model = PyTorchModel(
+        name="MyModel",
+        framework_version="1.8.0",
+        py_version="py3",
+        image_uri=lambda_step.properties.Outputs["model_image"],
+        model_data=lambda_step.properties.Outputs["model_artifact"],
+        sagemaker_session=pipeline_session,
+        entry_point=f"{DATA_DIR}/{_SCRIPT_NAME}",
+        role=_ROLE,
+    )
+
+    step_create_model = ModelStep(name="mymodelstep", step_args=model.create())
+
+    pipeline = Pipeline(
+        name="MyPipeline",
+        steps=[lambda_step, step_create_model],
+        sagemaker_session=pipeline_session,
+    )
+    steps = json.loads(pipeline.definition())["Steps"]
+    repack_step = steps[1]
+    assert repack_step["Arguments"]["InputDataConfig"][0]["DataSource"]["S3DataSource"][
+        "S3Uri"
+    ] == {"Get": "Steps.MyLambda.OutputParameters['model_artifact']"}
+    register_step = steps[2]
+    assert register_step["Arguments"]["PrimaryContainer"]["Image"] == {
+        "Get": "Steps.MyLambda.OutputParameters['model_image']"
+    }
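
Review note: the assertions above check that a Lambda step output stays a reference in the pipeline definition rather than being turned into a string. The shape they expect can be sketched in isolation as follows. The helper ``step_output_reference`` is hypothetical and only mirrors the JSON asserted by the test; it is not how the SDK builds the definition.

```python
import json


def step_output_reference(step_name, output_name):
    """Hypothetical helper: build the JSON form of a reference to a step output."""
    return {"Get": f"Steps.{step_name}.OutputParameters['{output_name}']"}


if __name__ == "__main__":
    # The container definition carries the reference, not a concrete image URI.
    primary_container = {"Image": step_output_reference("MyLambda", "model_image")}
    print(json.dumps(primary_container, indent=2))
```

Because the image is a reference of this kind when the pipeline is defined, there is no URI string from which a name could be derived, which appears to be why ``model_code_key_prefix`` now checks ``is_pipeline_variable`` first.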
