Merge master into master-assets #4515

Merged
merged 34 commits into from
Mar 19, 2024
7bf6abd
fix: make sure gpus are found in local_gpu run (#4384)
gverkes Mar 4, 2024
790bd87
feat: pin dll version to support python3.11 to the sdk (#4472)
akrishna1995 Mar 4, 2024
55293fb
fix: Skip No Canvas regions for test_deploy_best_candidate (#4477)
knikure Mar 5, 2024
84d965c
prepare release v2.211.0
Mar 5, 2024
f2f695a
update development version to v2.211.1.dev0
Mar 5, 2024
3aa0a56
change: Enhance model builder selection logic to include model size (…
samruds Mar 6, 2024
427c7ba
change: Upgrade smp to version 2.2 (#4479)
adtian2 Mar 6, 2024
68fb171
feat: Update SM Python SDK for PT 2.2.0 SM DLC (#4481)
sirutBuasai Mar 6, 2024
df2cc4d
fix: Create custom tarfile extractall util to fix backward compatibil…
knikure Mar 6, 2024
f82ceff
prepare release v2.212.0
Mar 6, 2024
907731d
update development version to v2.212.1.dev0
Mar 6, 2024
fcbd0bf
change: Update tblib constraint (#4452)
dbushy727 Mar 7, 2024
7fecc33
fix: make unit tests compatible with pytest-xdist (#4486)
benieric Mar 7, 2024
c622a73
feature: Add overriding logic in ModelBuilder when task is provided (…
xiongz945 Mar 8, 2024
615a8ad
feature: Accept user-defined env variables for the entry-point (#4175)
martinRenou Mar 8, 2024
d3a1825
fix: Move sagemaker pysdk version check after bootstrap in remote job…
qidewenwhen Mar 11, 2024
07e1b92
change: enable github actions for PRs (#4489)
benieric Mar 12, 2024
b51a613
feature: Add ModelDataSource and SourceUri support for model package …
mrudulmn Mar 12, 2024
8e400e9
feat: support JumpStart proprietary models (#4467)
Captainia Mar 12, 2024
064378d
chore: emit warning when no instance specific gated training env var …
evakravi Mar 13, 2024
377be87
fix: sagemaker session region not being used (#4469)
evakravi Mar 13, 2024
5828ad4
fix: add PT 2.2 support for smdistributed, pytorchddp, and torch_dist…
ruhanprasad Mar 13, 2024
1cdd446
change: split coverage out from testenv in tox.ini (#4495)
akrishna1995 Mar 13, 2024
d15a639
change: add ci-health checks (#4493)
benieric Mar 13, 2024
15a40ff
feat: tgi optimum 0.0.19, 0.0.20 releases (#4496)
evakravi Mar 13, 2024
fada4bf
feature: Add support for Streaming Inference (#4497)
mufaddal-rohawala Mar 14, 2024
8d22789
Add AutoML -> AutoMLV2 mapper (#4500)
repushko Mar 14, 2024
b9fbfbd
Skip of tests which are long running and causing the ResourceLimitInU…
repushko Mar 14, 2024
c2d5a23
Improvement of the tuner documentation (#4506)
repushko Mar 15, 2024
5678004
prepare release v2.213.0
Mar 15, 2024
09fe1c6
update development version to v2.213.1.dev0
Mar 15, 2024
5a7e99e
fix:urge customers to install latest version (#4507)
akrishna1995 Mar 15, 2024
434cba0
fix: list jumpstart models with invalid version strings (#4511)
Captainia Mar 18, 2024
c8d1428
fix: skip failing pt test (#4512)
benieric Mar 19, 2024
97 changes: 97 additions & 0 deletions .github/workflows/codebuild-ci-health.yml
@@ -0,0 +1,97 @@
+name: CI Health
+on:
+  schedule:
+    - cron: "0 */3 * * *"
+  workflow_dispatch:
+
+permissions:
+  id-token: write # This is required for requesting the JWT
+
+jobs:
+  codestyle-doc-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Codestyle & Doc Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-health-codestyle-doc-tests
+  unit-tests:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["py38", "py39", "py310"]
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Unit Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-health-unit-tests
+          env-vars-for-codebuild: |
+            PY_VERSION
+        env:
+          PY_VERSION: ${{ matrix.python-version }}
+  integ-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Integ Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        id: codebuild
+        with:
+          project-name: sagemaker-python-sdk-ci-health-integ-tests
+  slow-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Slow Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-health-slow-tests
+  localmode-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Local Mode Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-health-localmode-tests
+  notebook-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Notebook Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-health-notebook-tests
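
The schedule expression in the CI Health workflow can be read with a small sketch, assuming standard cron step semantics for the hour field. The helper name `hours_matching_step` is hypothetical and not part of this PR; it only illustrates which hours `*/3` selects. GitHub Actions evaluates schedules in UTC.

```python
from typing import List


def hours_matching_step(step: int) -> List[int]:
    """Hours of the day matched by a cron hour field of the form */step."""
    return [hour for hour in range(24) if hour % step == 0]


# "0 */3 * * *" means minute 0 of every third hour.
run_times = [f"{hour:02d}:00" for hour in hours_matching_step(3)]
print(run_times)

# role-duration-seconds from the workflow, converted to hours.
role_duration_hours = 10800 / 3600
print(role_duration_hours)
```

Under these assumptions the workflow starts eight times a day, and the 10800-second role duration works out to three hours, the same length as the interval between scheduled runs.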
48 changes: 48 additions & 0 deletions .github/workflows/codebuild-ci.yml
@@ -0,0 +1,48 @@
+name: PR Checks
+on:
+  pull_request_target:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.head_ref }}
+  cancel-in-progress: true
+
+permissions:
+  id-token: write # This is required for requesting the JWT
+
+jobs:
+  codestyle-doc-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Codestyle & Doc Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-codestyle-doc-tests
+          source-version-override: 'pr/${{ github.event.pull_request.number }}'
+  unit-tests:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["py38", "py39", "py310"]
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Unit Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        with:
+          project-name: sagemaker-python-sdk-ci-unit-tests
+          source-version-override: 'pr/${{ github.event.pull_request.number }}'
+          env-vars-for-codebuild: |
+            PY_VERSION
+        env:
+          PY_VERSION: ${{ matrix.python-version }}
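
The `concurrency.group` and `source-version-override` expressions in the PR Checks workflow can be mimicked in Python to see which strings they produce. This is only a sketch: the function names are hypothetical, and it assumes that `a || b` in a GitHub expression yields the first truthy operand, which Python's `or` mirrors here.

```python
from typing import Optional


def concurrency_group(workflow: str, pr_number: Optional[int], head_ref: str) -> str:
    """Mimic `${{ github.workflow }}-${{ pr.number || github.head_ref }}`."""
    # Falls back to the branch name when no pull request number is available.
    return f"{workflow}-{pr_number or head_ref}"


def source_version_override(pr_number: int) -> str:
    """Mimic `'pr/${{ github.event.pull_request.number }}'`."""
    return f"pr/{pr_number}"


print(concurrency_group("PR Checks", 4515, "master"))
print(concurrency_group("PR Checks", None, "master"))
print(source_version_override(4515))
```

Because runs for the same pull request share one group and `cancel-in-progress` is true, a new push to the pull request cancels the checks still running for the previous push.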
61 changes: 61 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,66 @@
 # Changelog

+## v2.213.0 (2024-03-15)
+
+### Features
+
+* Add support for Streaming Inference
+* tgi optimum 0.0.19, 0.0.20 releases
+* support JumpStart proprietary models
+* Add ModelDataSource and SourceUri support for model package and while registering
+* Accept user-defined env variables for the entry-point
+* Add overriding logic in ModelBuilder when task is provided
+
+### Bug Fixes and Other Changes
+
+* Improvement of the tuner documentation
+* Skip of tests which are long running and causing the ResourceLimitInUse exception
+* Add AutoML -> AutoMLV2 mapper
+* add ci-health checks
+* split coverage out from testenv in tox.ini
+* add PT 2.2 support for smdistributed, pytorchddp, and torch_distributed distributions
+* sagemaker session region not being used
+* chore: emit warning when no instance specific gated training env var is available, and raise exception when accept_eula flag is not supplied
+* enable github actions for PRs
+* Move sagemaker pysdk version check after bootstrap in remote job
+* make unit tests compatible with pytest-xdist
+* Update tblib constraint
+
+## v2.212.0 (2024-03-06)
+
+### Features
+
+* Update SM Python SDK for PT 2.2.0 SM DLC
+
+### Bug Fixes and Other Changes
+
+* Create custom tarfile extractall util to fix backward compatibility issue
+* Upgrade smp to version 2.2
+* Enhance model builder selection logic to include model size
+
+## v2.211.0 (2024-03-05)
+
+### Features
+
+* pin dll version to support python3.11 to the sdk
+* instance specific jumpstart host requirements
+* Add TensorFlow 2.14 image configs
+* Add AutoMLV2 support
+* Support selective pipeline execution between function step and regular step
+* Add new Triton DLC URIs
+
+### Bug Fixes and Other Changes
+
+* Skip No Canvas regions for test_deploy_best_candidate
+* make sure gpus are found in local_gpu run
+* Bump Apache Airflow version to 2.8.2
+* properly close sagemaker config file after loading config
+* remove enable_network_isolation from the python doc
+
+### Documentation Changes
+
+* Add doc for new feature processor APIs and classes
+
 ## v2.210.0 (2024-02-28)

 ### Features
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -77,8 +77,8 @@ Before sending us a pull request, please ensure that:
 1. Install coverage using `pip install .[test]`
 1. cd into the sagemaker-python-sdk folder: `cd sagemaker-python-sdk` or `cd /environment/sagemaker-python-sdk`
 1. Run the following tox command and verify that all code checks and unit tests pass: `tox tests/unit`
-
-You can also run a single test with the following command: `tox -e py310 -- -s -vv <path_to_file><file_name>::<test_function_name>`
+1. You can also run a single test with the following command: `tox -e py310 -- -s -vv <path_to_file><file_name>::<test_function_name>`
+1. You can run coverage via the runcoverage env: `tox -e runcoverage -- tests/unit` or `tox -e py310 -- tests/unit --cov=sagemaker --cov-append --cov-report xml`
 * Note that the coverage test will fail if you only run a single test, so make sure to surround the command with `export IGNORE_COVERAGE=-` and `unset IGNORE_COVERAGE`
 * Example: `export IGNORE_COVERAGE=- ; tox -e py310 -- -s -vv tests/unit/test_estimator.py::test_sagemaker_model_s3_uri_invalid ; unset IGNORE_COVERAGE`

9 changes: 6 additions & 3 deletions README.rst
@@ -22,6 +22,10 @@ SageMaker Python SDK
    :target: https://sagemaker.readthedocs.io/en/stable/
    :alt: Documentation Status

+.. image:: https://github.com/benieric/sagemaker-python-sdk/actions/workflows/codebuild-ci-health.yml/badge.svg
+   :target: https://github.com/benieric/sagemaker-python-sdk/actions/workflows/codebuild-ci-health.yml
+   :alt: CI Health
+
 SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker.

 With the SDK, you can train and deploy models using popular deep learning frameworks **Apache MXNet** and **TensorFlow**.
@@ -63,11 +67,10 @@ Table of Contents
 Installing the SageMaker Python SDK
 -----------------------------------

-The SageMaker Python SDK is built to PyPI and can be installed with pip as follows:
-
+The SageMaker Python SDK is built to PyPI and the latest version of the SageMaker Python SDK can be installed with pip as follows
 ::

-    pip install sagemaker
+    pip install sagemaker==<Latest version from PyPI from https://pypi.org/project/sagemaker/>

 You can install from source by cloning this repository and running a pip install command in the root directory of the repository:

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-2.210.1.dev0
+2.213.1.dev0
83 changes: 68 additions & 15 deletions doc/doc_utils/jumpstart_doc_utils.py
@@ -74,9 +74,12 @@ class Frameworks(str, Enum):

 JUMPSTART_REGION = "eu-west-2"
 SDK_MANIFEST_FILE = "models_manifest.json"
+PROPRIETARY_SDK_MANIFEST_FILE = "proprietary-sdk-manifest.json"
 JUMPSTART_BUCKET_BASE_URL = "https://jumpstart-cache-prod-{}.s3.{}.amazonaws.com".format(
     JUMPSTART_REGION, JUMPSTART_REGION
 )
+PROPRIETARY_DOC_BUCKET = "https://jumpstart-cache-prod-us-west-2.s3.us-west-2.amazonaws.com"
+
 TASK_MAP = {
     Tasks.IC: ProblemTypes.IMAGE_CLASSIFICATION,
     Tasks.IC_EMBEDDING: ProblemTypes.IMAGE_EMBEDDING,
@@ -152,18 +155,26 @@ class Frameworks(str, Enum):
 }


-def get_jumpstart_sdk_manifest():
-    url = "{}/{}".format(JUMPSTART_BUCKET_BASE_URL, SDK_MANIFEST_FILE)
+def get_public_s3_json_object(url):
     with request.urlopen(url) as f:
         models_manifest = f.read().decode("utf-8")
     return json.loads(models_manifest)


-def get_jumpstart_sdk_spec(key):
-    url = "{}/{}".format(JUMPSTART_BUCKET_BASE_URL, key)
-    with request.urlopen(url) as f:
-        model_spec = f.read().decode("utf-8")
-    return json.loads(model_spec)
+def get_jumpstart_sdk_manifest():
+    return get_public_s3_json_object(f"{JUMPSTART_BUCKET_BASE_URL}/{SDK_MANIFEST_FILE}")
+
+
+def get_proprietary_sdk_manifest():
+    return get_public_s3_json_object(f"{PROPRIETARY_DOC_BUCKET}/{PROPRIETARY_SDK_MANIFEST_FILE}")
+
+
+def get_jumpstart_sdk_spec(s3_key: str):
+    return get_public_s3_json_object(f"{JUMPSTART_BUCKET_BASE_URL}/{s3_key}")
+
+
+def get_proprietary_sdk_spec(s3_key: str):
+    return get_public_s3_json_object(f"{PROPRIETARY_DOC_BUCKET}/{s3_key}")


 def get_model_task(id):
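
The hunk above folds two near-identical fetch functions into one shared helper with thin wrappers. A minimal sketch of the same pattern that can run without network access is shown here; the `opener` parameter, the `fake_opener` stub, and the example URL and payload are all hypothetical additions for illustration and are not part of this PR.

```python
import io
import json
from urllib import request


def get_public_s3_json_object(url, opener=request.urlopen):
    """Fetch `url` and parse the body as JSON.

    `opener` is a hypothetical hook so the helper can be exercised offline.
    """
    with opener(url) as f:
        return json.loads(f.read().decode("utf-8"))


def fake_opener(url):
    # Stand-in for the network: a file-like object holding a canned JSON body.
    body = json.dumps([{"model_id": "example-model", "version": "1.0.0"}])
    return io.BytesIO(body.encode("utf-8"))


manifest = get_public_s3_json_object(
    "https://example.invalid/models_manifest.json", opener=fake_opener
)
print(manifest[0]["model_id"])
```

With a single helper owning the request and decoding logic, each wrapper only has to build its URL, which is what lets the proprietary manifest and spec functions be added as one-line functions.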
@@ -196,6 +207,45 @@ def get_model_source(url):
return "Source"


+def create_proprietary_model_table():
+    proprietary_content_intro = []
+    proprietary_content_intro.append("\n")
+    proprietary_content_intro.append(".. list-table:: Available Proprietary Models\n")
+    proprietary_content_intro.append(" :widths: 50 20 20 20 20\n")
+    proprietary_content_intro.append(" :header-rows: 1\n")
+    proprietary_content_intro.append(" :class: datatable\n")
+    proprietary_content_intro.append("\n")
+    proprietary_content_intro.append(" * - Model ID\n")
+    proprietary_content_intro.append(" - Fine Tunable?\n")
+    proprietary_content_intro.append(" - Supported Version\n")
+    proprietary_content_intro.append(" - Min SDK Version\n")
+    proprietary_content_intro.append(" - Source\n")
+
+    sdk_manifest = get_proprietary_sdk_manifest()
+    sdk_manifest_top_versions_for_models = {}
+
+    for model in sdk_manifest:
+        if model["model_id"] not in sdk_manifest_top_versions_for_models:
+            sdk_manifest_top_versions_for_models[model["model_id"]] = model
+        else:
+            if str(sdk_manifest_top_versions_for_models[model["model_id"]]["version"]) < str(
+                model["version"]
+            ):
+                sdk_manifest_top_versions_for_models[model["model_id"]] = model
+
+    proprietary_content_entries = []
+    for model in sdk_manifest_top_versions_for_models.values():
+        model_spec = get_proprietary_sdk_spec(model["spec_key"])
+        proprietary_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
+        proprietary_content_entries.append(" - {}\n".format(False))  # TODO: support training
+        proprietary_content_entries.append(" - {}\n".format(model["version"]))
+        proprietary_content_entries.append(" - {}\n".format(model["min_version"]))
+        proprietary_content_entries.append(
+            " - `{} <{}>`__ |external-link|\n".format("Source", model_spec.get("url"))
+        )
+    return proprietary_content_intro + proprietary_content_entries + ["\n"]
+
+
 def create_jumpstart_model_table():
     sdk_manifest = get_jumpstart_sdk_manifest()
     sdk_manifest_top_versions_for_models = {}
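
`create_proprietary_model_table` keeps the highest version per model ID by comparing versions as strings. String comparison is lexicographic, so it can disagree with numeric ordering once a component reaches two digits. The sketch below shows a numeric alternative under stated assumptions: `version_key` and `top_versions` are hypothetical names, the manifest entries are made-up examples, and treating non-numeric components as 0 is an arbitrary choice for illustration.

```python
from typing import Dict, List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0); non-numeric parts count as 0 (an assumption)."""
    return tuple(int(part) if part.isdigit() else 0 for part in str(version).split("."))


def top_versions(manifest: List[dict]) -> Dict[str, dict]:
    """Keep the entry with the highest numeric version for each model_id."""
    top: Dict[str, dict] = {}
    for model in manifest:
        current = top.get(model["model_id"])
        if current is None or version_key(current["version"]) < version_key(model["version"]):
            top[model["model_id"]] = model
    return top


example_manifest = [
    {"model_id": "example-model", "version": "1.9.0"},
    {"model_id": "example-model", "version": "1.10.0"},
]
print(top_versions(example_manifest)["example-model"]["version"])

# Plain string comparison orders these the other way round.
print("1.9.0" < "1.10.0")
```

For versions whose components all stay below 10 the two orderings agree, so the string comparison in the diff only matters once a manifest contains something like `1.10.0`.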
@@ -249,19 +299,19 @@ def create_jumpstart_model_table():
     file_content_intro.append(" - Source\n")

     dynamic_table_files = []
-    file_content_entries = []
+    open_weight_content_entries = []

     for model in sdk_manifest_top_versions_for_models.values():
         model_spec = get_jumpstart_sdk_spec(model["spec_key"])
         model_task = get_model_task(model_spec["model_id"])
         string_model_task = get_string_model_task(model_spec["model_id"])
         model_source = get_model_source(model_spec["url"])
-        file_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
-        file_content_entries.append(" - {}\n".format(model_spec["training_supported"]))
-        file_content_entries.append(" - {}\n".format(model["version"]))
-        file_content_entries.append(" - {}\n".format(model["min_version"]))
-        file_content_entries.append(" - {}\n".format(model_task))
-        file_content_entries.append(
+        open_weight_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
+        open_weight_content_entries.append(" - {}\n".format(model_spec["training_supported"]))
+        open_weight_content_entries.append(" - {}\n".format(model["version"]))
+        open_weight_content_entries.append(" - {}\n".format(model["min_version"]))
+        open_weight_content_entries.append(" - {}\n".format(model_task))
+        open_weight_content_entries.append(
             " - `{} <{}>`__ |external-link|\n".format(model_source, model_spec["url"])
         )

@@ -299,7 +349,10 @@ def create_jumpstart_model_table():
         f.writelines(file_content_single_entry)
         f.close()

+    proprietary_content_entries = create_proprietary_model_table()
+
     f = open("doc_utils/pretrainedmodels.rst", "a")
     f.writelines(file_content_intro)
-    f.writelines(file_content_entries)
+    f.writelines(open_weight_content_entries)
+    f.writelines(proprietary_content_entries)
     f.close()
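
Both table builders in this file emit reStructuredText `list-table` rows one `append` call per cell. The sketch below shows one way the row construction could be expressed as a helper. `list_table_row` is a hypothetical function, not something this PR adds, and the indentation widths in the prefixes are an assumption, since whitespace inside the string literals is not reliably preserved in this view of the diff.

```python
from typing import Iterable, List


def list_table_row(cells: Iterable[object]) -> List[str]:
    """Render one list-table row: the first cell starts the row, the rest continue it."""
    lines = []
    for index, cell in enumerate(cells):
        # Assumed indentation: three spaces under the directive, then the bullet.
        prefix = "   * - " if index == 0 else "     - "
        lines.append(f"{prefix}{cell}\n")
    return lines


row = list_table_row(["example-model", False, "1.0.0", "2.188.0"])
print("".join(row), end="")
```

A helper like this keeps the cell order in one place, so a row for the open-weight table and a row for the proprietary table differ only in the list of values passed in.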
1 change: 1 addition & 0 deletions doc/requirements.txt
@@ -4,3 +4,4 @@ docutils==0.15.2
 packaging==20.9
 jinja2==3.1.3
 schema==0.7.5
+accelerate>=0.24.1,<=0.27.0
1 change: 1 addition & 0 deletions requirements/extras/huggingface_requirements.txt
@@ -0,0 +1 @@
+accelerate>=0.24.1,<=0.27.0
1 change: 1 addition & 0 deletions requirements/extras/test_requirements.txt
@@ -39,3 +39,4 @@ tritonclient[http]<2.37.0
 onnx==1.14.1
 # tf2onnx==1.15.1
 nbformat>=5.9,<6
+accelerate>=0.24.1,<=0.27.0