Commit 77d7b50

Merge branch 'master-jumpstart-curated-hub' into curated_hub_tagris_copy
2 parents 6d5f599 + d820b28 commit 77d7b50

166 files changed: +14475 -3002 lines changed


.github/workflows/codebuild-ci.yml

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+name: PR Checks
+on:
+  pull_request_target:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.head_ref }}
+  cancel-in-progress: true
+
+permissions:
+  id-token: write # This is required for requesting the JWT
+
+jobs:
+  codestyle-doc-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Codestyle & Doc Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-codestyle-doc-tests
+          source-version-override: "pr/${{ github.event.pull_request.number }}"
+  unit-tests:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["py38", "py39", "py310"]
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Unit Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-unit-tests
+          source-version-override: "pr/${{ github.event.pull_request.number }}"
+          env-vars-for-codebuild: |
+            PY_VERSION
+        env:
+          PY_VERSION: ${{ matrix.python-version }}

CHANGELOG.md

Lines changed: 50 additions & 0 deletions
@@ -1,5 +1,55 @@
 # Changelog

+## v2.212.0 (2024-03-06)
+
+### Features
+
+- Update SM Python SDK for PT 2.2.0 SM DLC
+
+### Bug Fixes and Other Changes
+
+- Create custom tarfile extractall util to fix backward compatibility issue
+- Upgrade smp to version 2.2
+- Enhance model builder selection logic to include model size
+
+## v2.211.0 (2024-03-05)
+
+### Features
+
+- pin dll version to support python3.11 to the sdk
+- instance specific jumpstart host requirements
+- Add TensorFlow 2.14 image configs
+- Add AutoMLV2 support
+- Support selective pipeline execution between function step and regular step
+- Add new Triton DLC URIs
+
+### Bug Fixes and Other Changes
+
+- Skip No Canvas regions for test_deploy_best_candidate
+- make sure gpus are found in local_gpu run
+- Bump Apache Airflow version to 2.8.2
+- properly close sagemaker config file after loading config
+- remove enable_network_isolation from the python doc
+
+### Documentation Changes
+
+- Add doc for new feature processor APIs and classes
+
+## v2.210.0 (2024-02-28)
+
+### Features
+
+- Prepend SageMaker Studio App Type to boto3 User Agent string
+- TGI optimum 0.0.18 (general+llm)
+- TGI 1.4.2
+
+### Bug Fixes and Other Changes
+
+- tolerate vulnerable old model for integ test and temporarily skip test_list_jumpstart_models_script_filter
+- add missing regions to pytorch config
+- Add validation for sagemaker version on remote job
+- fixed implementation of fail_on_violation for transform with monitoring
+
 ## v2.209.0 (2024-02-24)

 ### Features

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.209.1.dev0
+2.212.1.dev0

doc/api/prep_data/feature_store.rst

Lines changed: 16 additions & 2 deletions
@@ -60,6 +60,7 @@ Feature Definition
     :members:
     :show-inheritance:

+
 Inputs
 ******

@@ -181,9 +182,13 @@ Feature Processor Data Source
     :members:
     :show-inheritance:

+.. autoclass:: sagemaker.feature_store.feature_processor.PySparkDataSource
+    :members:
+    :show-inheritance:

-Feature Processor Scheduler
-***************************
+
+Feature Processor Scheduler and Triggers
+****************************************

 .. automethod:: sagemaker.feature_store.feature_processor.to_pipeline

@@ -196,3 +201,12 @@ Feature Processor Scheduler
 .. automethod:: sagemaker.feature_store.feature_processor.describe

 .. automethod:: sagemaker.feature_store.feature_processor.list_pipelines
+
+.. automethod:: sagemaker.feature_store.feature_processor.put_trigger
+
+.. automethod:: sagemaker.feature_store.feature_processor.enable_trigger
+
+.. automethod:: sagemaker.feature_store.feature_processor.disable_trigger
+
+.. automethod:: sagemaker.feature_store.feature_processor.delete_trigger
+

doc/api/training/automlv2.rst

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+AutoMLV2
+--------
+
+.. automodule:: sagemaker.automl.automlv2
+    :members:
+    :undoc-members:
+    :show-inheritance:

doc/api/training/index.rst

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ Training APIs
     algorithm
     analytics
     automl
+    automlv2
     debugger
     estimators
     tuner

doc/doc_utils/jumpstart_doc_utils.py

Lines changed: 68 additions & 15 deletions
@@ -74,9 +74,12 @@ class Frameworks(str, Enum):

 JUMPSTART_REGION = "eu-west-2"
 SDK_MANIFEST_FILE = "models_manifest.json"
+PROPRIETARY_SDK_MANIFEST_FILE = "proprietary-sdk-manifest.json"
 JUMPSTART_BUCKET_BASE_URL = "https://jumpstart-cache-prod-{}.s3.{}.amazonaws.com".format(
     JUMPSTART_REGION, JUMPSTART_REGION
 )
+PROPRIETARY_DOC_BUCKET = "https://jumpstart-cache-prod-us-west-2.s3.us-west-2.amazonaws.com"
+
 TASK_MAP = {
     Tasks.IC: ProblemTypes.IMAGE_CLASSIFICATION,
     Tasks.IC_EMBEDDING: ProblemTypes.IMAGE_EMBEDDING,
@@ -152,18 +155,26 @@ class Frameworks(str, Enum):
 }


-def get_jumpstart_sdk_manifest():
-    url = "{}/{}".format(JUMPSTART_BUCKET_BASE_URL, SDK_MANIFEST_FILE)
+def get_public_s3_json_object(url):
     with request.urlopen(url) as f:
         models_manifest = f.read().decode("utf-8")
     return json.loads(models_manifest)


-def get_jumpstart_sdk_spec(key):
-    url = "{}/{}".format(JUMPSTART_BUCKET_BASE_URL, key)
-    with request.urlopen(url) as f:
-        model_spec = f.read().decode("utf-8")
-    return json.loads(model_spec)
+def get_jumpstart_sdk_manifest():
+    return get_public_s3_json_object(f"{JUMPSTART_BUCKET_BASE_URL}/{SDK_MANIFEST_FILE}")
+
+
+def get_proprietary_sdk_manifest():
+    return get_public_s3_json_object(f"{PROPRIETARY_DOC_BUCKET}/{PROPRIETARY_SDK_MANIFEST_FILE}")
+
+
+def get_jumpstart_sdk_spec(s3_key: str):
+    return get_public_s3_json_object(f"{JUMPSTART_BUCKET_BASE_URL}/{s3_key}")
+
+
+def get_proprietary_sdk_spec(s3_key: str):
+    return get_public_s3_json_object(f"{PROPRIETARY_DOC_BUCKET}/{s3_key}")


 def get_model_task(id):
@@ -196,6 +207,45 @@ def get_model_source(url):
     return "Source"


+def create_proprietary_model_table():
+    proprietary_content_intro = []
+    proprietary_content_intro.append("\n")
+    proprietary_content_intro.append(".. list-table:: Available Proprietary Models\n")
+    proprietary_content_intro.append(" :widths: 50 20 20 20 20\n")
+    proprietary_content_intro.append(" :header-rows: 1\n")
+    proprietary_content_intro.append(" :class: datatable\n")
+    proprietary_content_intro.append("\n")
+    proprietary_content_intro.append(" * - Model ID\n")
+    proprietary_content_intro.append(" - Fine Tunable?\n")
+    proprietary_content_intro.append(" - Supported Version\n")
+    proprietary_content_intro.append(" - Min SDK Version\n")
+    proprietary_content_intro.append(" - Source\n")
+
+    sdk_manifest = get_proprietary_sdk_manifest()
+    sdk_manifest_top_versions_for_models = {}
+
+    for model in sdk_manifest:
+        if model["model_id"] not in sdk_manifest_top_versions_for_models:
+            sdk_manifest_top_versions_for_models[model["model_id"]] = model
+        else:
+            if str(sdk_manifest_top_versions_for_models[model["model_id"]]["version"]) < str(
+                model["version"]
+            ):
+                sdk_manifest_top_versions_for_models[model["model_id"]] = model
+
+    proprietary_content_entries = []
+    for model in sdk_manifest_top_versions_for_models.values():
+        model_spec = get_proprietary_sdk_spec(model["spec_key"])
+        proprietary_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
+        proprietary_content_entries.append(" - {}\n".format(False))  # TODO: support training
+        proprietary_content_entries.append(" - {}\n".format(model["version"]))
+        proprietary_content_entries.append(" - {}\n".format(model["min_version"]))
+        proprietary_content_entries.append(
+            " - `{} <{}>`__ |external-link|\n".format("Source", model_spec.get("url"))
+        )
+    return proprietary_content_intro + proprietary_content_entries + ["\n"]
+
+
 def create_jumpstart_model_table():
     sdk_manifest = get_jumpstart_sdk_manifest()
     sdk_manifest_top_versions_for_models = {}
@@ -249,19 +299,19 @@ def create_jumpstart_model_table():
     file_content_intro.append(" - Source\n")

     dynamic_table_files = []
-    file_content_entries = []
+    open_weight_content_entries = []

     for model in sdk_manifest_top_versions_for_models.values():
         model_spec = get_jumpstart_sdk_spec(model["spec_key"])
         model_task = get_model_task(model_spec["model_id"])
         string_model_task = get_string_model_task(model_spec["model_id"])
         model_source = get_model_source(model_spec["url"])
-        file_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
-        file_content_entries.append(" - {}\n".format(model_spec["training_supported"]))
-        file_content_entries.append(" - {}\n".format(model["version"]))
-        file_content_entries.append(" - {}\n".format(model["min_version"]))
-        file_content_entries.append(" - {}\n".format(model_task))
-        file_content_entries.append(
+        open_weight_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
+        open_weight_content_entries.append(" - {}\n".format(model_spec["training_supported"]))
+        open_weight_content_entries.append(" - {}\n".format(model["version"]))
+        open_weight_content_entries.append(" - {}\n".format(model["min_version"]))
+        open_weight_content_entries.append(" - {}\n".format(model_task))
+        open_weight_content_entries.append(
             " - `{} <{}>`__ |external-link|\n".format(model_source, model_spec["url"])
         )

@@ -299,7 +349,10 @@ def create_jumpstart_model_table():
         f.writelines(file_content_single_entry)
         f.close()

+    proprietary_content_entries = create_proprietary_model_table()
+
     f = open("doc_utils/pretrainedmodels.rst", "a")
     f.writelines(file_content_intro)
-    f.writelines(file_content_entries)
+    f.writelines(open_weight_content_entries)
+    f.writelines(proprietary_content_entries)
     f.close()
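
Note on the version selection in `create_proprietary_model_table`: the diff keeps one manifest entry per `model_id` by comparing `str(version)` values, which orders versions lexicographically, so `"1.10.0"` sorts before `"1.9.0"`. Below is a minimal sketch of a numeric comparison, assuming versions are plain dotted integers. `parse_version` and `top_versions` are hypothetical helper names that do not appear in the commit, and the sample manifest entries are made up for illustration.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as "1.10.0" into (1, 10, 0).

    Assumption: every component is a non-negative integer. Suffixes such as
    "rc1" or wildcards are not handled in this sketch.
    """
    return tuple(int(part) for part in str(version).split("."))


def top_versions(manifest: List[dict]) -> Dict[str, dict]:
    """Keep the entry with the numerically highest version for each model_id."""
    best: Dict[str, dict] = {}
    for model in manifest:
        current = best.get(model["model_id"])
        if current is None or parse_version(current["version"]) < parse_version(
            model["version"]
        ):
            best[model["model_id"]] = model
    return best


# Hypothetical manifest entries, shaped like the ones the diff iterates over.
sample_manifest = [
    {"model_id": "example-model", "version": "1.9.0"},
    {"model_id": "example-model", "version": "1.10.0"},
    {"model_id": "other-model", "version": "2.0.0"},
]
latest = top_versions(sample_manifest)
```

With string comparison the `1.9.0` entry would win for `example-model`; the tuple comparison selects `1.10.0` instead.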

doc/requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -4,3 +4,4 @@ docutils==0.15.2
 packaging==20.9
 jinja2==3.1.3
 schema==0.7.5
+accelerate>=0.24.1,<=0.27.0

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+accelerate>=0.24.1,<=0.27.0

requirements/extras/test_requirements.txt

Lines changed: 2 additions & 1 deletion
@@ -12,7 +12,7 @@ awslogs==0.14.0
 black==22.3.0
 stopit==1.1.2
 # Update tox.ini to have correct version of airflow constraints file
-apache-airflow==2.8.1
+apache-airflow==2.8.2
 apache-airflow-providers-amazon==7.2.1
 attrs>=23.1.0,<24
 fabric==2.6.0
@@ -39,3 +39,4 @@ tritonclient[http]<2.37.0
 onnx==1.14.1
 # tf2onnx==1.15.1
 nbformat>=5.9,<6
+accelerate>=0.24.1,<=0.27.0

setup.py

Lines changed: 2 additions & 1 deletion
@@ -63,7 +63,7 @@ def read_requirements(filename):
     "PyYAML~=6.0",
     "jsonschema",
     "platformdirs",
-    "tblib>=1.7.0,<3",
+    "tblib>=1.7.0,<4",
     "urllib3>=1.26.8,<3.0.0",
     "requests",
     "docker",
@@ -79,6 +79,7 @@ def read_requirements(filename):
     "feature-processor": read_requirements(
         "requirements/extras/feature-processor_requirements.txt"
     ),
+    "huggingface": read_requirements("requirements/extras/huggingface_requirements.txt"),
 }
 # Meta dependency groups
 extras["all"] = [item for group in extras.values() for item in group]
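
Note on the `extras["all"]` context line: the comprehension flattens every extras group into one list, so the new `huggingface` group is included in `all` automatically. Here is a small sketch of that comprehension on made-up data, assuming `extras` maps a group name to a list of requirement strings. The `feature-processor` entry is a placeholder, not the real contents of that requirements file.

```python
# Illustrative extras mapping; in setup.py the values come from read_requirements().
extras = {
    "feature-processor": ["example-dep-a==1.0"],  # placeholder requirement
    "huggingface": ["accelerate>=0.24.1,<=0.27.0"],
}

# Same comprehension as in setup.py: iterate over groups, then over items in each group.
all_requirements = [item for group in extras.values() for item in group]

# Registering the flattened list afterwards mirrors the "Meta dependency groups" step.
extras["all"] = all_requirements
```

Because the list is built before `"all"` is added to the dictionary, the meta group does not include itself.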

src/sagemaker/__init__.py

Lines changed: 11 additions & 0 deletions
@@ -61,6 +61,17 @@

 from sagemaker.automl.automl import AutoML, AutoMLJob, AutoMLInput  # noqa: F401
 from sagemaker.automl.candidate_estimator import CandidateEstimator, CandidateStep  # noqa: F401
+from sagemaker.automl.automlv2 import (  # noqa: F401
+    AutoMLV2,
+    AutoMLJobV2,
+    LocalAutoMLDataChannel,
+    AutoMLDataChannel,
+    AutoMLTimeSeriesForecastingConfig,
+    AutoMLImageClassificationConfig,
+    AutoMLTabularConfig,
+    AutoMLTextClassificationConfig,
+    AutoMLTextGenerationConfig,
+)

 from sagemaker.debugger import ProfilerConfig, Profiler  # noqa: F401


src/sagemaker/accept_types.py

Lines changed: 3 additions & 0 deletions
@@ -16,6 +16,7 @@

 from sagemaker.jumpstart import artifacts, utils as jumpstart_utils
 from sagemaker.jumpstart.constants import DEFAULT_JUMPSTART_SAGEMAKER_SESSION
+from sagemaker.jumpstart.enums import JumpStartModelType
 from sagemaker.session import Session


@@ -80,6 +81,7 @@ def retrieve_default(
     tolerate_vulnerable_model: bool = False,
     tolerate_deprecated_model: bool = False,
     sagemaker_session: Session = DEFAULT_JUMPSTART_SAGEMAKER_SESSION,
+    model_type: JumpStartModelType = JumpStartModelType.OPEN_WEIGHTS,
 ) -> str:
     """Retrieves the default accept type for the model matching the given arguments.

@@ -122,4 +124,5 @@ def retrieve_default(
         tolerate_vulnerable_model=tolerate_vulnerable_model,
         tolerate_deprecated_model=tolerate_deprecated_model,
         sagemaker_session=sagemaker_session,
+        model_type=model_type,
     )