Commit 6cabd43

Merge branch 'master' into deepar-kix
2 parents 49b6fbe + 0e9c10e commit 6cabd43

252 files changed, with 17796 additions and 1881 deletions

CHANGELOG.md

Lines changed: 137 additions & 0 deletions
@@ -1,5 +1,142 @@
1 1
# Changelog
2 2

3+
## v2.104.0 (2022-08-17)
4+
5+
### Features
6+
7+
* local mode executor implementation
8+
* Pipelines local mode setup
9+
* Add PT 1.12 support
10+
* added _AnalysisConfigGenerator for clarify
11+
12+
### Bug Fixes and Other Changes
13+
14+
* yaml safe_load sagemaker config
15+
* pipelines local mode minor bug fixes
16+
* add local mode integ tests
17+
* implement local JsonGet function
18+
* Add Pipeline annotation in model base class and tensorflow estimator
19+
* Allow users to customize trial component display names for pipeline launched jobs
20+
* Update localmode code to decode urllib response as UTF8
21+
22+
### Documentation Changes
23+
24+
* New content for Pipelines local mode
25+
* Correct documentation error
26+
27+
## v2.103.0 (2022-08-05)
28+
29+
### Features
30+
31+
* AutoGluon 0.4.3 and 0.5.2 image_uris
32+
33+
### Bug Fixes and Other Changes
34+
35+
* Revert "change: add a check to prevent launching a modelparallel job on CPU only instances"
36+
* Add gpu capability to local
37+
* Link PyTorch 1.11 to 1.11.0
38+
39+
## v2.102.0 (2022-08-04)
40+
41+
### Features
42+
43+
* add warnings for xgboost specific rules in debugger rules
44+
* Add PyTorch DDP distribution support
45+
* Add test for profiler enablement with debugger_hook false
46+
47+
### Bug Fixes and Other Changes
48+
49+
* Two letter language code must be supported
50+
* add a check to prevent launching a modelparallel job on CPU only instances
51+
* Allow StepCollection added in ConditionStep to be depended on
52+
* Add PipelineVariable annotation in framework models
53+
* skip managed spot training mxnet nb
54+
55+
### Documentation Changes
56+
57+
* smdistributed libraries currency updates
58+
59+
## v2.101.1 (2022-07-28)
60+
61+
### Bug Fixes and Other Changes
62+
63+
* added more ml frameworks supported by SageMaker Workflows
64+
* test: Vspecinteg2
65+
* Add PipelineVariable annotation in amazon models
66+
67+
## v2.101.0 (2022-07-27)
68+
69+
### Features
70+
71+
* Algorithms region launch on CGK
72+
* enhance-bucket-override-support
73+
* infer framework and version
74+
* support clarify bias detection when facets not included
75+
* Add CGK region to frameworks by DLC
76+
77+
### Bug Fixes and Other Changes
78+
79+
* Make repack step output path align with model repack path
80+
* Support parameterized source code input for TrainingStep
81+
82+
### Documentation Changes
83+
84+
* heterogeneous cluster api doc fix
85+
* smdmp v1.10 release note
86+
87+
## v2.100.0 (2022-07-18)
88+
89+
### Features
90+
91+
* upgrade to support python 3.10
92+
* Add target_model to support multi-model endpoints
93+
* Added support for feature group schema change and feature parameters
94+
95+
### Bug Fixes and Other Changes
96+
97+
* enable model.register without 'inference' & 'transform' instances
98+
* rename RegisterModel inner steps to prevent duplicate step names
99+
* remove primitive_or_expr() from conditions
100+
* support pipeline variables for spark processors run arguments
101+
* make 'ModelInput' field optional for inference recommendation
102+
* Fix processing image uri param
103+
* fix: neo inferentia as compilation target not using framework ver
104+
105+
### Documentation Changes
106+
107+
* SageMaker model parallel library v1.10.0 documentation
108+
* add detail & links to clarify docstrings
109+
110+
## v2.99.0 (2022-07-08)
111+
112+
### Features
113+
114+
* heterogeneous cluster set up in distribution config
115+
* support heterogeneous cluster for training
116+
* include fields to work with inference recommender
117+
118+
### Bug Fixes and Other Changes
119+
120+
* Moving the newly added field instance_group to the end of method
121+
* image_uri does not need to be specified with instance_groups
122+
* Loosen version of attrs dependency
123+
* Add PipelineVariable annotation in estimator, processing, tuner, transformer base classes
124+
* model table link
125+
126+
### Documentation Changes
127+
128+
* documentation for heterogeneous cluster
129+
130+
## v2.98.0 (2022-07-05)
131+
132+
### Features
133+
134+
* Adding deepar image
135+
136+
### Documentation Changes
137+
138+
* edit to clarify how to use inference.py
139+
3 140
## v2.97.0 (2022-06-28)
4 141

5 142
### Deprecations and Removals

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.97.1.dev0
+2.104.1.dev0

doc/Makefile

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 
 # You can set these variables from the command line.
 SPHINXOPTS = -W
-SPHINXBUILD = python -msphinx
+SPHINXBUILD = python3 -msphinx
 SPHINXPROJ = sagemaker
 SOURCEDIR = .
 BUILDDIR = _build

doc/algorithms/index.rst

Lines changed: 7 additions & 12 deletions
@@ -1,20 +1,15 @@
 ######################
-First-Party Algorithms
+Built-in Algorithms
 ######################
 
 Amazon SageMaker provides implementations of some common machine learning algorithms optimized for GPU architecture and massive datasets.
 
 .. toctree::
     :maxdepth: 2
 
-    sagemaker.amazon.amazon_estimator
-    factorization_machines
-    ipinsights
-    kmeans
-    knn
-    lda
-    linear_learner
-    ntm
-    object2vec
-    pca
-    randomcutforest
+    tabular/index
+    text/index
+    time_series/index
+    unsupervised/index
+    vision/index
+    other/index

doc/algorithms/other/index.rst

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
######################
2+
Other
3+
######################
4+
5+
:ref:`All Pre-trained Models <all-pretrained-models>`
6+
7+
.. toctree::
8+
:maxdepth: 2
9+
10+
sagemaker.amazon.amazon_estimator

doc/algorithms/tabular/autogluon.rst

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
############
2+
AutoGluon
3+
############
4+
5+
`AutoGluon-Tabular <https://auto.gluon.ai/stable/index.html>`__ is a popular open-source AutoML framework that trains highly accurate machine learning models on an unprocessed tabular dataset.
6+
Unlike existing AutoML frameworks that primarily focus on model and hyperparameter selection, AutoGluon-Tabular succeeds by ensembling multiple models and stacking them in multiple layers.
7+
8+
9+
The following table outlines a variety of sample notebooks that address different use cases of the Amazon SageMaker AutoGluon-Tabular algorithm.
10+
11+
.. list-table::
12+
:widths: 25 25
13+
:header-rows: 1
14+
15+
* - Notebook Title
16+
- Description
17+
* - `Tabular classification with Amazon SageMaker AutoGluon-Tabular algorithm <https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/autogluon_tabular/Amazon_Tabular_Classification_AutoGluon.ipynb>`__
18+
- This notebook demonstrates the use of the Amazon SageMaker AutoGluon-Tabular algorithm to train and host a tabular classification model.
19+
* - `Tabular regression with Amazon SageMaker AutoGluon-Tabular algorithm <https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/autogluon_tabular/Amazon_Tabular_Regression_AutoGluon.ipynb>`__
20+
- This notebook demonstrates the use of the Amazon SageMaker AutoGluon-Tabular algorithm to train and host a tabular regression model.
21+
22+
23+
For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see
24+
`Use Amazon SageMaker Notebook Instances <https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html>`__. After you have created a notebook
25+
instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its
26+
Use tab and choose Create copy.
27+
28+
For detailed documentation, please refer to the `SageMaker AutoGluon-Tabular Algorithm <https://docs.aws.amazon.com/sagemaker/latest/dg/autogluon-tabular.html>`__.

doc/algorithms/tabular/catboost.rst

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
############
2+
CatBoost
3+
############
4+
5+
6+
`CatBoost <https://catboost.ai/>`__ is a popular and high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT)
7+
algorithm. GBDT is a supervised learning algorithm that attempts to accurately predict a target variable by combining an ensemble of
8+
estimates from a set of simpler and weaker models.
9+
10+
CatBoost introduces two critical algorithmic advances to GBDT:
11+
12+
* The implementation of ordered boosting, a permutation-driven alternative to the classic algorithm
13+
14+
* An innovative algorithm for processing categorical features
15+
16+
Both techniques were created to fight a prediction shift caused by a special kind of target leakage present in all currently existing
17+
implementations of gradient boosting algorithms.
18+
19+
The following table outlines a variety of sample notebooks that address different use cases of the Amazon SageMaker CatBoost algorithm.
20+
21+
.. list-table::
22+
:widths: 25 25
23+
:header-rows: 1
24+
25+
* - Notebook Title
26+
- Description
27+
* - `Tabular classification with Amazon SageMaker LightGBM and CatBoost algorithm <https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/lightgbm_catboost_tabular/Amazon_Tabular_Classification_LightGBM_CatBoost.ipynb>`__
28+
- This notebook demonstrates the use of the Amazon SageMaker CatBoost algorithm to train and host a tabular classification model.
29+
* - `Tabular regression with Amazon SageMaker LightGBM and CatBoost algorithm <https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/lightgbm_catboost_tabular/Amazon_Tabular_Regression_LightGBM_CatBoost.ipynb>`__
30+
- This notebook demonstrates the use of the Amazon SageMaker CatBoost algorithm to train and host a tabular regression model.
31+
32+
For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see
33+
`Use Amazon SageMaker Notebook Instances <https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html>`__. After you have created a notebook
34+
instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its
35+
Use tab and choose Create copy.
36+
37+
For detailed documentation, please refer to the `SageMaker CatBoost Algorithm <https://docs.aws.amazon.com/sagemaker/latest/dg/catboost.html>`__.
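
The CatBoost page above mentions ordered processing of categorical features as a defense against target leakage. The sketch below is an illustration only, not part of this commit and not CatBoost's actual implementation: it shows the general idea of an ordered target statistic, where each row's category is encoded using only the targets of rows that precede it in a random permutation. The function name `ordered_target_stats` and the parameters `prior`, `prior_weight`, and `seed` are hypothetical names chosen for this example.

```python
import random


def ordered_target_stats(categories, targets, prior, prior_weight=1.0, seed=0):
    """Encode each category value without looking at the row's own target.

    Rows are visited in a random order. A row's encoding is a smoothed mean of
    the targets of previously visited rows that share its category, so the
    row's own label never leaks into its feature value.
    """
    order = list(range(len(categories)))
    random.Random(seed).shuffle(order)  # hypothetical fixed seed for repeatability

    sums = {}    # running sum of targets seen so far, per category
    counts = {}  # running count of rows seen so far, per category
    encoded = [0.0] * len(categories)

    for i in order:
        cat = categories[i]
        seen_sum = sums.get(cat, 0.0)
        seen_count = counts.get(cat, 0)
        # Smoothed mean: falls back to the prior when no earlier rows exist.
        encoded[i] = (seen_sum + prior_weight * prior) / (seen_count + prior_weight)
        # Only now is this row's target added to the running statistics.
        sums[cat] = seen_sum + targets[i]
        counts[cat] = seen_count + 1

    return encoded


encoded = ordered_target_stats(["a", "a", "b"], [1, 0, 1], prior=0.5)
```

In this toy call, the first row visited in each category receives the prior, because no earlier rows of that category exist yet. A plain per-category target mean would instead fold every row's own label into its encoding, which is the kind of leakage the page describes.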

doc/algorithms/factorization_machines.rst renamed to doc/algorithms/tabular/factorization_machines.rst

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-FactorizationMachines
+Factorization Machines
 -------------------------
 
 The Amazon SageMaker Factorization Machines algorithm.

doc/algorithms/tabular/index.rst

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
######################
2+
Tabular
3+
######################
4+
5+
Amazon SageMaker provides built-in algorithms that are tailored to the analysis of tabular data. The built-in SageMaker algorithms for tabular data can be used for either classification or regression problems.
6+
7+
.. toctree::
8+
:maxdepth: 2
9+
10+
autogluon
11+
catboost
12+
factorization_machines
13+
knn
14+
lightgbm
15+
linear_learner
16+
tabtransformer
17+
xgboost
18+
object2vec
File renamed without changes.

doc/algorithms/tabular/lightgbm.rst

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
############
2+
LightGBM
3+
############
4+
5+
`LightGBM <https://lightgbm.readthedocs.io/en/latest/>`__ is a popular and efficient open-source implementation of the Gradient Boosting
6+
Decision Tree (GBDT) algorithm. GBDT is a supervised learning algorithm that attempts to accurately predict a target variable by
7+
combining an ensemble of estimates from a set of simpler and weaker models. LightGBM uses additional techniques to significantly improve
8+
the efficiency and scalability of conventional GBDT.
9+
10+
The following table outlines a variety of sample notebooks that address different use cases of the Amazon SageMaker LightGBM algorithm.
11+
12+
.. list-table::
13+
:widths: 25 25
14+
:header-rows: 1
15+
16+
* - Notebook Title
17+
- Description
18+
* - `Tabular classification with Amazon SageMaker LightGBM and CatBoost algorithm <https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/lightgbm_catboost_tabular/Amazon_Tabular_Classification_LightGBM_CatBoost.ipynb>`__
19+
- This notebook demonstrates the use of the Amazon SageMaker LightGBM algorithm to train and host a tabular classification model.
20+
* - `Tabular regression with Amazon SageMaker LightGBM and CatBoost algorithm <https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/lightgbm_catboost_tabular/Amazon_Tabular_Regression_LightGBM_CatBoost.ipynb>`__
21+
- This notebook demonstrates the use of the Amazon SageMaker LightGBM algorithm to train and host a tabular regression model.
22+
23+
For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see
24+
`Use Amazon SageMaker Notebook Instances <https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html>`__. After you have created a notebook
25+
instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its
26+
Use tab and choose Create copy.
27+
28+
For detailed documentation, please refer to the `SageMaker LightGBM Algorithm <https://docs.aws.amazon.com/sagemaker/latest/dg/lightgbm.html>`__.
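
The LightGBM page above describes GBDT as combining an ensemble of estimates from simpler, weaker models. As a rough illustration only (not part of this commit, and far simpler than what LightGBM does), the sketch below fits one-split decision stumps to the residuals of the running prediction under squared loss. The names `fit_stump` and `fit_gbdt`, the single numeric feature, and the default `n_rounds` and `learning_rate` values are all assumptions made for this example.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split regression stump on a single numeric feature.

    Assumes xs contains at least two distinct values.
    """
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so both sides of the split are guaranteed to be non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = sum((r - left_value) ** 2 for r in left)
        error += sum((r - right_value) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    _, threshold, left_value, right_value = best
    return lambda x: left_value if x <= threshold else right_value


def fit_gbdt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boost stumps on residuals, which are the negative gradient of squared loss."""
    base = mean(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Each weak model nudges the ensemble a small step toward the targets.
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    return predict


predict = fit_gbdt([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5])
```

On this toy step-shaped dataset each round closes a fixed fraction of the remaining gap, so `predict(2)` approaches 1 and `predict(5)` approaches 5 as rounds accumulate. Production libraries add histogram-based split finding, regularization, and multi-feature trees on top of this basic loop.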
File renamed without changes.
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
###############
2+
TabTransformer
3+
###############
4+
5+
`TabTransformer <https://arxiv.org/abs/2012.06678>`__ is a novel deep tabular data modeling architecture for supervised learning. The TabTransformer architecture is built on self-attention-based Transformers.
6+
The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy. Furthermore, the contextual embeddings learned from TabTransformer
7+
are highly robust against both missing and noisy data features, and provide better interpretability.
8+
9+
10+
The following table outlines a variety of sample notebooks that address different use cases of the Amazon SageMaker TabTransformer algorithm.
11+
12+
.. list-table::
13+
:widths: 25 25
14+
:header-rows: 1
15+
16+
* - Notebook Title
17+
- Description
18+
* - `Tabular classification with Amazon SageMaker TabTransformer algorithm <https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/tabtransformer_tabular/Amazon_Tabular_Classification_TabTransformer.ipynb>`__
19+
- This notebook demonstrates the use of the Amazon SageMaker TabTransformer algorithm to train and host a tabular classification model.
20+
* - `Tabular regression with Amazon SageMaker TabTransformer algorithm <https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/tabtransformer_tabular/Amazon_Tabular_Regression_TabTransformer.ipynb>`__
21+
- This notebook demonstrates the use of the Amazon SageMaker TabTransformer algorithm to train and host a tabular regression model.
22+
23+
For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see
24+
`Use Amazon SageMaker Notebook Instances <https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html>`__. After you have created a notebook
25+
instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its
26+
Use tab and choose Create copy.
27+
28+
For detailed documentation, please refer to the `SageMaker TabTransformer Algorithm <https://docs.aws.amazon.com/sagemaker/latest/dg/tabtransformer.html>`__.

doc/algorithms/tabular/xgboost.rst

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
############
2+
XGBoost
3+
############
4+
5+
`XGBoost <https://github.com/dmlc/xgboost>`__ (eXtreme Gradient Boosting) is a popular and efficient open-source implementation of the gradient boosted trees algorithm. Gradient boosting is a supervised learning algorithm that attempts to accurately predict a target variable
6+
by combining an ensemble of estimates from a set of simpler and weaker models. The XGBoost algorithm performs well in machine learning competitions because of its robust handling of a variety of data types, relationships, distributions, and the variety of hyperparameters that you can
7+
fine-tune. You can use XGBoost for regression, classification (binary and multiclass), and ranking problems.
8+
9+
You can use the new release of the XGBoost algorithm either as an Amazon SageMaker built-in algorithm or as a framework to run training scripts in your local environments. This implementation has a smaller memory footprint, better logging, improved hyperparameter validation, and
10+
a larger set of metrics than the original versions. It provides an XGBoost estimator that executes a training script in a managed XGBoost environment. The current release of SageMaker XGBoost is based on the original XGBoost versions 1.0, 1.2, 1.3, and 1.5.
11+
12+
The following table outlines a variety of sample notebooks that address different use cases of the Amazon SageMaker XGBoost algorithm.
13+
14+
.. list-table::
15+
:widths: 25 25
16+
:header-rows: 1
17+
18+
* - Notebook Title
19+
- Description
20+
* - `How to Create a Custom XGBoost container? <https://sagemaker-examples.readthedocs.io/en/latest/aws_sagemaker_studio/sagemaker_studio_image_build/xgboost_bring_your_own/Batch_Transform_BYO_XGB.html>`__
21+
- This notebook shows you how to build a custom XGBoost Container with Amazon SageMaker Batch Transform.
22+
* - `Regression with XGBoost using Parquet <https://sagemaker-examples.readthedocs.io/en/latest/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_parquet_input_training.html>`__
23+
- This notebook shows you how to use the Abalone dataset in Parquet to train an XGBoost model.
24+
* - `How to Train and Host a Multiclass Classification Model? <https://sagemaker-examples.readthedocs.io/en/latest/introduction_to_amazon_algorithms/xgboost_mnist/xgboost_mnist.html>`__
25+
- This notebook shows how to use the MNIST dataset to train and host a multiclass classification model.
26+
* - `How to train a Model for Customer Churn Prediction? <https://sagemaker-examples.readthedocs.io/en/latest/introduction_to_applying_machine_learning/xgboost_customer_churn/xgboost_customer_churn.html>`__
27+
- This notebook shows you how to train a model to predict mobile customer departure in an effort to identify unhappy customers.
28+
* - `An Introduction to Amazon SageMaker Managed Spot infrastructure for XGBoost Training <https://sagemaker-examples.readthedocs.io/en/latest/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_managed_spot_training.html>`__
29+
- This notebook shows you how to use Spot Instances for training with an XGBoost container.
30+
* - `How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs? <https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-debugger/xgboost_builtin_rules/xgboost-regression-debugger-rules.html>`__
31+
- This notebook shows you how to use Amazon SageMaker Debugger to monitor training jobs to detect inconsistencies.
32+
* - `How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs in Real-Time? <https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-debugger/xgboost_realtime_analysis/xgboost-realtime-analysis.html>`__
33+
- This notebook shows you how to use the MNIST dataset and Amazon SageMaker Debugger to perform real-time analysis of XGBoost training jobs while training jobs are running.
34+
35+
For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see
36+
`Use Amazon SageMaker Notebook Instances <https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html>`__. After you have created a notebook
37+
instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its
38+
Use tab and choose Create copy.
39+
40+
For detailed documentation, please refer to the `SageMaker XGBoost Algorithm <https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html>`__.
