Commit 408d5a3

Merge branch 'master' into pr-framework-processor-round-02

Author: Verdi March
2 parents 5ec5378 + 14d65f4 commit 408d5a3

26 files changed: +808 -44 lines changed

.github/ISSUE_TEMPLATE/config.yml

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 blank_issues_enabled: false
 contact_links:
   - name: Ask a question
-    url: https://stackoverflow.com/questions/tagged/amazon-sagemaker
-    about: Use Stack Overflow to ask and answer questions
+    url: https://github.com/aws/sagemaker-python-sdk/discussions
+    about: Use GitHub Discussions to ask and answer questions

CHANGELOG.md

Lines changed: 25 additions & 0 deletions
@@ -1,5 +1,30 @@
 # Changelog
 
+## v2.42.1 (2021-05-27)
+
+### Bug Fixes and Other Changes
+
+* default value removed if zero for integer param
+
+## v2.42.0 (2021-05-24)
+
+### Features
+
+* support for custom pipeline execution name
+* Add data ingestion only data-wrangler flow recipe generation helper function
+
+### Bug Fixes and Other Changes
+
+* add kms key for processing job code upload
+* remove failing notebooks from notebook pr test
+* fix in and not in condition bug
+* Update overview.rst
+
+### Documentation Changes
+
+* Update "Ask a question" contact link
+* Update smdp docs with sparse_as_dense support
+
 ## v2.41.0 (2021-05-17)
 
 ### Features

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.41.1.dev0
+2.42.2.dev0

doc/api/training/sdp_versions/latest/smd_data_parallel_tensorflow.rst

Lines changed: 3 additions & 1 deletion
@@ -443,7 +443,7 @@ TensorFlow API
 
 *  Supported compression types - ``none``, ``fp16``
 
-- ``sparse_as_dense:`` Not supported. Raises not supported error.
+- ``sparse_as_dense:`` Treats sparse gradient tensor as dense tensor. Defaults to ``False``.
 
 - ``op (smdistributed.dataparallel.tensorflow.ReduceOp)(optional)``: The reduction operation to combine tensors across different ranks. Defaults to ``Average`` if None is given.
 
@@ -482,6 +482,8 @@ TensorFlow API
 
 *  Supported compression types - ``none``, ``fp16``
 
+- ``sparse_as_dense:`` Treats sparse gradient tensor as dense tensor. Defaults to ``False``.
+
 - ``op (smdistributed.dataparallel.tensorflow.ReduceOp)(optional)``: The reduction operation to combine tensors across different ranks. Defaults to ``Average`` if None is given.
 
 * Supported ops: ``AVERAGE``

doc/api/training/sdp_versions/v1.0.0/smd_data_parallel_tensorflow.rst

Lines changed: 3 additions & 1 deletion
@@ -456,7 +456,7 @@ TensorFlow API
 
 *  Supported compression types - ``none``, ``fp16``
 
-- ``sparse_as_dense:`` Not supported. Raises not supported error.
+- ``sparse_as_dense:`` Treats sparse gradient tensor as dense tensor. Defaults to ``False``.
 
 - ``op (smdistributed.dataparallel.tensorflow.ReduceOp)(optional)``: The reduction operation to combine tensors across different ranks. Defaults to ``Average`` if None is given.
 
@@ -496,6 +496,8 @@ TensorFlow API
 
 *  Supported compression types - ``none``, ``fp16``
 
+- ``sparse_as_dense:`` Treats sparse gradient tensor as dense tensor. Defaults to ``False``.
+
 - ``op (smdistributed.dataparallel.tensorflow.ReduceOp)(optional)``: The reduction operation to combine tensors across different ranks. Defaults to ``Average`` if None is given.
 
 * Supported ops: ``AVERAGE``

doc/api/training/sdp_versions/v1.1.x/smd_data_parallel_tensorflow.rst

Lines changed: 3 additions & 1 deletion
@@ -459,7 +459,7 @@ library with TensorFlow.
 
 *  Supported compression types - ``none``, ``fp16``
 
-- ``sparse_as_dense:`` Not supported. Raises not supported error.
+- ``sparse_as_dense:`` Treats sparse gradient tensor as dense tensor. Defaults to ``False``.
 
 - ``op (smdistributed.dataparallel.tensorflow.ReduceOp)(optional)``: The reduction operation to combine tensors across different ranks. Defaults to ``Average`` if None is given.
 
@@ -499,6 +499,8 @@ library with TensorFlow.
 
 *  Supported compression types - ``none``, ``fp16``
 
+- ``sparse_as_dense:`` Treats sparse gradient tensor as dense tensor. Defaults to ``False``.
+
 - ``op (smdistributed.dataparallel.tensorflow.ReduceOp)(optional)``: The reduction operation to combine tensors across different ranks. Defaults to ``Average`` if None is given.
 
 * Supported ops: ``AVERAGE``

doc/frameworks/sklearn/using_sklearn.rst

Lines changed: 4 additions & 0 deletions
@@ -84,6 +84,10 @@ inadvertently run your training code at the wrong point in execution.
 
 For more on training environment variables, please visit https://github.com/aws/sagemaker-containers.
 
+.. important::
+    The sagemaker-containers repository has been deprecated,
+    however it is still used to define Scikit-learn and XGBoost environment variables.
+
 Save the Model
 --------------
 

doc/frameworks/xgboost/using_xgboost.rst

Lines changed: 4 additions & 0 deletions
@@ -88,6 +88,10 @@ but you can access useful properties about the training environment through vari
 
 For the exhaustive list of available environment variables, see the `SageMaker Containers documentation <https://github.com/aws/sagemaker-containers#list-of-provided-environment-variables-by-sagemaker-containers>`__.
 
+.. important::
+    The sagemaker-containers repository has been deprecated,
+    however it is still used to define Scikit-learn and XGBoost environment variables.
+
 Let's look at the main elements of the script. Starting with the ``__main__`` guard,
 use a parser to read the hyperparameters passed to the estimator when creating the training job.
 These hyperparameters are made available as arguments to our input script.

doc/overview.rst

Lines changed: 2 additions & 2 deletions
@@ -374,7 +374,7 @@ Here are examples of how to use Amazon FSx for Lustre as input for training:
 
     file_system_input = FileSystemInput(file_system_id='fs-2',
                                         file_system_type='FSxLustre',
-                                        directory_path='/fsx/tensorflow',
+                                        directory_path='/<mount-id>/tensorflow',
                                         file_system_access_mode='ro')
 
     # Start an Amazon SageMaker training job with FSx using the FileSystemInput class
@@ -394,7 +394,7 @@ Here are examples of how to use Amazon FSx for Lustre as input for training:
 
     records = FileSystemRecordSet(file_system_id='fs-=2,
                                   file_system_type='FSxLustre',
-                                  directory_path='/fsx/kmeans',
+                                  directory_path='/<mount-id>/kmeans',
                                   num_records=784,
                                   feature_dim=784)
 

src/sagemaker/_studio.py

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ def _append_project_tags(tags=None, working_dir=None):
         return tags
 
     all_tags = tags or []
+    additional_tags = [tag for tag in additional_tags if tag not in all_tags]
     all_tags.extend(additional_tags)
 
     return all_tags
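
The added line filters out project tags that the caller already supplied, so the merged list no longer contains duplicates. A minimal sketch of that behaviour, assuming a hypothetical helper `append_unique_tags` and illustrative tag keys (neither is taken from the SDK); unlike the original, the sketch copies the input list rather than extending it in place:

```python
def append_unique_tags(tags, additional_tags):
    """Return tags plus any additional tags not already present (hypothetical helper)."""
    all_tags = list(tags or [])  # copy so the caller's list is not mutated
    new_tags = [tag for tag in additional_tags if tag not in all_tags]
    all_tags.extend(new_tags)
    return all_tags


# Illustrative tag dictionaries; the key names are assumptions.
existing = [{"Key": "sagemaker:project-name", "Value": "demo"}]
project_tags = [
    {"Key": "sagemaker:project-name", "Value": "demo"},
    {"Key": "sagemaker:project-id", "Value": "p-123"},
]

merged = append_unique_tags(existing, project_tags)
print(merged)
```

Because tags are plain dictionaries, `tag not in all_tags` compares by value, so a tag is skipped only when both its key and value match an existing entry.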

src/sagemaker/image_uri_config/xgboost.json

Lines changed: 29 additions & 0 deletions
@@ -183,6 +183,35 @@
                 "us-west-2": "246618743249"
             },
             "repository": "sagemaker-xgboost"
+        },
+        "1.3-1": {
+            "registries": {
+                "af-south-1": "510948584623",
+                "ap-east-1": "651117190479",
+                "ap-northeast-1": "354813040037",
+                "ap-northeast-2": "366743142698",
+                "ap-south-1": "720646828776",
+                "ap-southeast-1": "121021644041",
+                "ap-southeast-2": "783357654285",
+                "ca-central-1": "341280168497",
+                "cn-north-1": "450853457545",
+                "cn-northwest-1": "451049120500",
+                "eu-central-1": "492215442770",
+                "eu-north-1": "662702820516",
+                "eu-west-1": "141502667606",
+                "eu-west-2": "764974769150",
+                "eu-west-3": "659782779980",
+                "eu-south-1": "978288397137",
+                "me-south-1": "801668240914",
+                "sa-east-1": "737474898029",
+                "us-east-1": "683313688378",
+                "us-east-2": "257758044811",
+                "us-gov-west-1": "414596584902",
+                "us-iso-east-1": "833128469047",
+                "us-west-1": "746614075791",
+                "us-west-2": "246618743249"
+            },
+            "repository": "sagemaker-xgboost"
         }
     }
 }
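
For orientation, here is a rough sketch of how a registry entry like the new `1.3-1` block could be turned into an ECR image URI. The helper name `ecr_image_uri`, the URI template, and the use of `1.3-1` as the image tag are assumptions for illustration, not the SDK's own lookup code; the sketch also distinguishes only the standard and China DNS suffixes.

```python
# Subset of the "1.3-1" registries added in this commit.
XGBOOST_1_3_1_REGISTRIES = {
    "us-west-2": "246618743249",
    "eu-west-1": "141502667606",
    "cn-north-1": "450853457545",
}


def ecr_image_uri(region, registries, repository="sagemaker-xgboost", tag="1.3-1"):
    """Build an ECR image URI (hypothetical helper; the tag value is an assumption)."""
    account = registries[region]  # raises KeyError for regions missing from the map
    # China regions use a different DNS suffix; other partitions are not handled here.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"{account}.dkr.ecr.{region}.{suffix}/{repository}:{tag}"


uri = ecr_image_uri("us-west-2", XGBOOST_1_3_1_REGISTRIES)
print(uri)
```

In the SDK itself, image lookups normally go through `sagemaker.image_uris.retrieve`, which reads this JSON file; the sketch only illustrates the shape of the data.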

src/sagemaker/processing.py

Lines changed: 26 additions & 11 deletions
@@ -33,7 +33,6 @@
 from sagemaker.local import LocalSession
 from sagemaker.utils import base_name_from_image, get_config_value, name_from_base
 from sagemaker.session import Session
-from sagemaker.network import NetworkConfig # noqa: F401 # pylint: disable=unused-import
 from sagemaker.workflow.properties import Properties
 from sagemaker.workflow.parameters import Parameter
 from sagemaker.workflow.entities import Expression
@@ -227,14 +226,14 @@ def _normalize_args(
         """
         self._current_job_name = self._generate_current_job_name(job_name=job_name)
 
-        inputs_with_code = self._include_code_in_inputs(inputs, code)
+        inputs_with_code = self._include_code_in_inputs(inputs, code, kms_key)
         normalized_inputs = self._normalize_inputs(inputs_with_code, kms_key)
         normalized_outputs = self._normalize_outputs(outputs)
         self.arguments = arguments
 
         return normalized_inputs, normalized_outputs
 
-    def _include_code_in_inputs(self, inputs, _code):
+    def _include_code_in_inputs(self, inputs, _code, _kms_key):
         """A no op in the base class to include code in the processing job inputs.
 
         Args:
@@ -243,6 +242,8 @@ def _include_code_in_inputs(self, inputs, _code):
                 :class:`~sagemaker.processing.ProcessingInput` objects.
             _code (str): This can be an S3 URI or a local path to a file with the framework
                 script to run (default: None). A no op in the base class.
+            kms_key (str): The ARN of the KMS key that is used to encrypt the
+                user code file (default: None).
 
         Returns:
             list[:class:`~sagemaker.processing.ProcessingInput`]: inputs
@@ -536,7 +537,7 @@ def run(
         if wait:
             self.latest_job.wait(logs=logs)
 
-    def _include_code_in_inputs(self, inputs, code):
+    def _include_code_in_inputs(self, inputs, code, kms_key=None):
         """Converts code to appropriate input and includes in input list.
 
         Side effects include:
@@ -549,12 +550,14 @@ def _include_code_in_inputs(self, inputs, code):
                 :class:`~sagemaker.processing.ProcessingInput` objects.
             code (str): This can be an S3 URI or a local path to a file with the framework
                 script to run (default: None).
+            kms_key (str): The ARN of the KMS key that is used to encrypt the
+                user code file (default: None).
 
         Returns:
             list[:class:`~sagemaker.processing.ProcessingInput`]: inputs together with the
                 code as `ProcessingInput`.
         """
-        user_code_s3_uri = self._handle_user_code_url(code)
+        user_code_s3_uri = self._handle_user_code_url(code, kms_key)
         user_script_name = self._get_user_code_name(code)
 
         inputs_with_code = self._convert_code_and_add_to_inputs(inputs, user_code_s3_uri)
@@ -575,14 +578,16 @@ def _get_user_code_name(self, code):
         code_url = urlparse(code)
         return os.path.basename(code_url.path)
 
-    def _handle_user_code_url(self, code):
+    def _handle_user_code_url(self, code, kms_key=None):
         """Gets the S3 URL containing the user's code.
 
         Inspects the scheme the customer passed in ("s3://" for code in S3, "file://" or nothing
         for absolute or local file paths. Uploads the code to S3 if the code is a local file.
 
         Args:
             code (str): A URL to the customer's code.
+            kms_key (str): The ARN of the KMS key that is used to encrypt the
+                user code file (default: None).
 
         Returns:
             str: The S3 URL to the customer's code.
@@ -611,7 +616,7 @@ def _handle_user_code_url(self, code):
                         code
                     )
                 )
-            user_code_s3_uri = self._upload_code(code_path)
+            user_code_s3_uri = self._upload_code(code_path, kms_key)
         else:
             raise ValueError(
                 "code {} url scheme {} is not recognized. Please pass a file path or S3 url".format(
@@ -620,11 +625,13 @@ def _handle_user_code_url(self, code):
             )
         return user_code_s3_uri
 
-    def _upload_code(self, code):
+    def _upload_code(self, code, kms_key=None):
         """Uploads a code file or directory specified as a string and returns the S3 URI.
 
         Args:
             code (str): A file or directory to be uploaded to S3.
+            kms_key (str): The ARN of the KMS key that is used to encrypt the
+                user code file (default: None).
 
         Returns:
             str: The S3 URI of the uploaded file or directory.
@@ -638,7 +645,10 @@ def _upload_code(self, code):
             self._CODE_CONTAINER_INPUT_NAME,
         )
         return s3.S3Uploader.upload(
-            local_path=code, desired_s3_uri=desired_s3_uri, sagemaker_session=self.sagemaker_session
+            local_path=code,
+            desired_s3_uri=desired_s3_uri,
+            kms_key=kms_key,
+            sagemaker_session=self.sagemaker_session,
         )
 
     def _convert_code_and_add_to_inputs(self, inputs, s3_uri):
@@ -674,7 +684,9 @@ def _set_entrypoint(self, command, user_script_name):
         """
         user_script_location = str(
             pathlib.PurePosixPath(
-                self._CODE_CONTAINER_BASE_PATH, self._CODE_CONTAINER_INPUT_NAME, user_script_name
+                self._CODE_CONTAINER_BASE_PATH,
+                self._CODE_CONTAINER_INPUT_NAME,
+                user_script_name,
             )
         )
         self.entrypoint = command + [user_script_location]
@@ -1074,7 +1086,10 @@ def _to_request_dict(self):
         """Generates a request dictionary using the parameters provided to the class."""
 
         # Create the request dictionary.
-        s3_input_request = {"InputName": self.input_name, "AppManaged": self.app_managed}
+        s3_input_request = {
+            "InputName": self.input_name,
+            "AppManaged": self.app_managed,
+        }
 
         if self.s3_input:
             # Check the compression type, then add it to the dictionary.
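
The `processing.py` changes above thread `kms_key` from `_include_code_in_inputs` through `_handle_user_code_url` and `_upload_code` down to the uploader, so that locally supplied code is uploaded with the same key as the other inputs. The sketch below mirrors only that call chain. `RecordingUploader`, `SketchProcessor`, the bucket path, and the key ARN are hypothetical stand-ins so the example runs without AWS; only the local-file branch is modelled.

```python
class RecordingUploader:
    """Stand-in for an S3 uploader: records arguments instead of calling AWS."""

    def __init__(self):
        self.calls = []

    def upload(self, local_path, desired_s3_uri, kms_key=None):
        self.calls.append(
            {"local_path": local_path, "desired_s3_uri": desired_s3_uri, "kms_key": kms_key}
        )
        # Pretend the file landed under the desired prefix.
        return desired_s3_uri + "/" + local_path.split("/")[-1]


class SketchProcessor:
    """Hypothetical, simplified mirror of the kms_key call chain in the diff."""

    def __init__(self, uploader):
        self.uploader = uploader

    def _upload_code(self, code, kms_key=None):
        return self.uploader.upload(
            local_path=code,
            desired_s3_uri="s3://example-bucket/job/input/code",  # placeholder URI
            kms_key=kms_key,
        )

    def _handle_user_code_url(self, code, kms_key=None):
        # Only the "local file" branch is sketched; S3 URIs would be returned as-is.
        return self._upload_code(code, kms_key)

    def _include_code_in_inputs(self, inputs, code, kms_key=None):
        user_code_s3_uri = self._handle_user_code_url(code, kms_key)
        return list(inputs or []) + [user_code_s3_uri]


uploader = RecordingUploader()
processor = SketchProcessor(uploader)
example_key = "arn:aws:kms:us-west-2:111122223333:key/example"  # made-up ARN
inputs = processor._include_code_in_inputs([], "scripts/preprocess.py", kms_key=example_key)
print(inputs, uploader.calls[0]["kms_key"])
```

Before this commit the key stopped at `_normalize_inputs`, so in terms of the sketch the uploader would have recorded `kms_key=None` for the code file.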

src/sagemaker/workflow/conditions.py

Lines changed: 2 additions & 2 deletions
@@ -186,8 +186,8 @@ def to_request(self) -> RequestType:
         """Get the request structure for workflow service calls."""
         return {
             "Type": self.condition_type.value,
-            "Value": self.value.expr,
-            "In": [primitive_or_expr(in_value) for in_value in self.in_values],
+            "QueryValue": self.value.expr,
+            "Values": [primitive_or_expr(in_value) for in_value in self.in_values],
         }
 
 
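
The fix here is purely a rename of the request keys: `Value` becomes `QueryValue` and `In` becomes `Values`. A small sketch of the resulting request shape, assuming a hypothetical helper `condition_in_to_request`; the `"In"` type string and the `{"Get": ...}` expression form are illustrative assumptions, not confirmed by this diff:

```python
def condition_in_to_request(query_expr, in_values, condition_type="In"):
    """Build the post-fix request structure (hypothetical helper)."""
    return {
        "Type": condition_type,       # assumed enum value
        "QueryValue": query_expr,     # was "Value" before the fix
        "Values": list(in_values),    # was "In" before the fix
    }


request = condition_in_to_request(
    {"Get": "Parameters.InstanceType"},  # assumed expression shape
    ["ml.m5.large", "ml.m5.xlarge"],
)
print(request)
```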

src/sagemaker/workflow/parameters.py

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ def to_request(self) -> RequestType:
             "Name": self.name,
             "Type": self.parameter_type.value,
         }
-        if self.default_value:
+        if self.default_value is not None:
             value["DefaultValue"] = self.default_value
         return value
 
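
The one-line change matters because a truthiness test treats `0`, `0.0`, `""`, and `False` the same as "no default". A minimal sketch contrasting the two checks, using hypothetical free functions in place of the `to_request` method:

```python
def to_request_truthy(name, parameter_type, default_value=None):
    """Pre-fix behaviour: a falsy default such as 0 is silently dropped."""
    value = {"Name": name, "Type": parameter_type}
    if default_value:
        value["DefaultValue"] = default_value
    return value


def to_request_explicit(name, parameter_type, default_value=None):
    """Post-fix behaviour: only a missing (None) default is omitted."""
    value = {"Name": name, "Type": parameter_type}
    if default_value is not None:
        value["DefaultValue"] = default_value
    return value


before = to_request_truthy("InstanceCount", "Integer", 0)
after = to_request_explicit("InstanceCount", "Integer", 0)
print(before)
print(after)
```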

src/sagemaker/workflow/pipeline.py

Lines changed: 4 additions & 1 deletion
@@ -20,7 +20,6 @@
 
 import attr
 import botocore
-
 from botocore.exceptions import ClientError
 
 from sagemaker._studio import _append_project_tags
@@ -197,13 +196,15 @@ def delete(self) -> Dict[str, Any]:
     def start(
         self,
         parameters: Dict[str, Any] = None,
+        execution_display_name: str = None,
         execution_description: str = None,
     ):
         """Starts a Pipeline execution in the Workflow service.
 
         Args:
             parameters (List[Dict[str, str]]): A list of parameter dicts of the form
                 {"Name": "string", "Value": "string"}.
+            execution_display_name (str): The display name of the pipeline execution.
             execution_description (str): A description of the execution.
 
         Returns:
@@ -220,11 +221,13 @@ def start(
                 "This pipeline is not associated with a Pipeline in SageMaker. "
                 "Please invoke create() first before attempting to invoke start()."
             )
+
         kwargs = dict(PipelineName=self.name)
         update_args(
             kwargs,
             PipelineParameters=format_start_parameters(parameters),
             PipelineExecutionDescription=execution_description,
+            PipelineExecutionDisplayName=execution_display_name,
         )
         response = self.sagemaker_session.sagemaker_client.start_pipeline_execution(**kwargs)
         return _PipelineExecution(
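
The new `execution_display_name` argument is optional and is forwarded as `PipelineExecutionDisplayName`. The sketch below shows how the request kwargs could be assembled so that unset options are left out. The local `update_args` is a stand-in written for this example under the assumption that the SDK helper skips `None` values; `build_start_kwargs` is a hypothetical name, and parameter formatting is omitted.

```python
def update_args(args, **kwargs):
    """Stand-in helper: copy only keyword values that are not None into args."""
    for key, value in kwargs.items():
        if value is not None:
            args[key] = value


def build_start_kwargs(pipeline_name, execution_display_name=None, execution_description=None):
    """Assemble request kwargs for starting an execution (hypothetical helper)."""
    kwargs = dict(PipelineName=pipeline_name)
    update_args(
        kwargs,
        PipelineExecutionDescription=execution_description,
        PipelineExecutionDisplayName=execution_display_name,
    )
    return kwargs


plain = build_start_kwargs("demo-pipeline")
named = build_start_kwargs("demo-pipeline", execution_display_name="nightly-run")
print(plain)
print(named)
```

With the change applied, a call such as `pipeline.start(execution_display_name="nightly-run")` should add the display name to the request, while existing callers that omit it send the same request as before.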
