Commit 3b4ba23

chuyang-deng authored and laurenyu committed
doc: add AutoML README (#1158)
1 parent 88439c5 commit 3b4ba23

3 files changed: +88 -8 lines changed

README.rst

Lines changed: 13 additions & 0 deletions
@@ -63,6 +63,7 @@ Table of Contents
 19. `Inference Pipelines <https://sagemaker.readthedocs.io/en/stable/overview.html#inference-pipelines>`__
 20. `Amazon SageMaker Operators for Kubernetes <#amazon-sagemaker-operators-for-kubernetes>`__
 21. `SageMaker Workflow <#sagemaker-workflow>`__
+22. `SageMaker Autopilot <#sagemaker-autopilot>`__
 
 
 Installing the SageMaker Python SDK
@@ -344,3 +345,15 @@ You can use Apache Airflow to author, schedule and monitor SageMaker workflow.
 For more information, see `SageMaker Workflow in Apache Airflow`_.
 
 .. _SageMaker Workflow in Apache Airflow: https://sagemaker.readthedocs.io/en/stable/using_workflow.html
+
+SageMaker Autopilot
+-------------------
+
+Amazon SageMaker Autopilot is an automated machine learning solution (commonly referred to as "AutoML") for tabular
+datasets. It automatically trains and tunes the best machine learning models for classification or regression based
+on your data, and hosts a series of models on an Inference Pipeline.
+
+For more information about SageMaker Autopilot, see `SageMaker Autopilot`_.
+
+.. _SageMaker Autopilot: src/sagemaker/automl/README.rst
+

src/sagemaker/automl/README.rst

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+SageMaker Autopilot
+===================
+
+Amazon SageMaker Autopilot is an automated machine learning solution (commonly referred to as "AutoML") for tabular
+datasets. It automatically trains and tunes the best machine learning models for classification or regression based
+on your data, and hosts a series of models on an Inference Pipeline.
+
+SageMaker AutoML Class
+~~~~~~~~~~~~~~~~~~~~~~
+
+The SageMaker ``AutoML`` class is similar to a SageMaker ``Estimator`` where you define the attributes of an AutoML
+job and feed input data to start the job.
+
+Here's a simple example of using the ``AutoML`` object:
+
+.. code:: python
+
+    from sagemaker import AutoML
+
+    auto_ml = AutoML(
+        role="sagemaker-execution-role",
+        target_attribute_name="y",
+        sagemaker_session=sagemaker_session,
+    )
+    auto_ml.fit(inputs=inputs)
+
+
+The above code starts an AutoML job (data processing, training, tuning) and outputs a maximum of 500 candidates by
+default. You can modify the number of output candidates by specifying ``max_candidates`` in the constructor. The AutoML
+job will figure out the problem type (BinaryClassification, MulticlassClassification, Regression), but you can also
+specify the problem type by setting ``problem_type`` in the constructor. Other configurable settings include security
+settings, time limits, job objectives, tags, etc.
+
+After an AutoML job is done, there are a few things that you can do with the result.
+
+#. Describe the AutoML job: ``describe_auto_ml_job()`` will give you an overview of the AutoML job, information
+   includes job name, best candidate, input/output locations, problem type, objective metrics, etc.
+
+#. Get the best candidate: ``best_candidate()`` allows you to get the best candidate of an AutoML job. You can view the
+   best candidate's step jobs, inference containers and other information like objective metrics.
+
+#. List all the candidates: ``list_candidates()`` gives you all the candidates (up to the maximum number) of an AutoML
+   job. By calling this method, you can view and compare the candidates.
+
+#. Deploy the best candidate (or any given candidate): ``deploy()`` by default will deploy the best candidate to an
+   inference pipeline. But you can also specify a candidate to deploy through ``candidate`` parameter.
+
+For more information about ``AutoML`` parameters, please refer to: https://sagemaker.readthedocs.io/en/stable/sagemaker.automl.html
+
+SageMaker CandidateEstimator Class
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The SageMaker ``CandidateEstimator`` class converts a dictionary with AutoML candidate information to an object that
+allows you to re-run the candidate's step jobs.
+
+The simplest re-run is to feed a new dataset but reuse all other configurations from the candidate:
+
+.. code:: python
+
+    candidate_estimator = CandidateEstimator(candidate_dict)
+    inputs = new_inputs
+    candidate_estimator.fit(inputs=inputs)
+
+If you want to have more control over the step jobs of the candidate, you can call ``get_steps()`` and construct
+training/tuning jobs by yourself.
+
+For more information about ``CandidateEstimator`` parameters, please refer to: https://sagemaker.readthedocs.io/en/stable/sagemaker.candidate_estimator.html
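
The README added above says that ``list_candidates()`` lets you "view and compare the candidates". As a rough illustration of what such a comparison might look like, here is a minimal sketch that ranks candidate dictionaries by their final objective metric. It assumes each candidate carries ``CandidateName`` and ``FinalAutoMLJobObjectiveMetric`` keys, as in the service's candidate listing response. The candidate names and metric values are made up for illustration, and ``rank_candidates`` is a hypothetical helper, not part of the SDK.

```python
# Made-up candidate dictionaries; a real list would come from
# auto_ml.list_candidates(). The key names are assumed to match the
# service's candidate listing response.
candidates = [
    {
        "CandidateName": "candidate-a",
        "FinalAutoMLJobObjectiveMetric": {"MetricName": "validation:accuracy", "Value": 0.91},
    },
    {
        "CandidateName": "candidate-b",
        "FinalAutoMLJobObjectiveMetric": {"MetricName": "validation:accuracy", "Value": 0.94},
    },
    {
        "CandidateName": "candidate-c",
        "FinalAutoMLJobObjectiveMetric": {"MetricName": "validation:accuracy", "Value": 0.88},
    },
]


def rank_candidates(candidates, maximize=True):
    """Hypothetical helper: sort candidates by objective metric value.

    Pass maximize=False for metrics where lower is better (for example MSE).
    """
    return sorted(
        candidates,
        key=lambda c: c["FinalAutoMLJobObjectiveMetric"]["Value"],
        reverse=maximize,
    )


ranked = rank_candidates(candidates)
for c in ranked:
    metric = c["FinalAutoMLJobObjectiveMetric"]
    print(c["CandidateName"], metric["MetricName"], metric["Value"])

best_name = ranked[0]["CandidateName"]
```

A dictionary selected this way could then be passed to ``deploy()`` through its ``candidate`` parameter, or to ``CandidateEstimator``, as the README describes.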

tests/integ/auto_ml_utils.py

Lines changed: 8 additions & 8 deletions
@@ -27,16 +27,16 @@
 
 
 def create_auto_ml_job_if_not_exist(sagemaker_session):
     auto_ml_job_name = "python-sdk-integ-test-base-job"
-    auto_ml = AutoML(
-        role=ROLE,
-        target_attribute_name=TARGET_ATTRIBUTE_NAME,
-        sagemaker_session=sagemaker_session,
-        max_candidates=3,
-    )
 
     try:
-        auto_ml.describe_auto_ml_job(job_name=auto_ml_job_name)
+        sagemaker_session.describe_auto_ml_job(job_name=auto_ml_job_name)
     except Exception as e:  # noqa: F841
+        auto_ml = AutoML(
+            role=ROLE,
+            target_attribute_name=TARGET_ATTRIBUTE_NAME,
+            sagemaker_session=sagemaker_session,
+            max_candidates=3,
+        )
         inputs = sagemaker_session.upload_data(path=TRAINING_DATA, key_prefix=PREFIX + "/input")
         with timeout(minutes=AUTO_ML_DEFAULT_TIMEMOUT_MINUTES):
-            auto_ml.fit(inputs, job_name=auto_ml_job_name)
+            auto_ml.fit(inputs, job_name=auto_ml_job_name, wait=True)
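
The change to the test helper above builds the ``AutoML`` object only when the describe call fails, that is, when the base job does not exist yet. The following is a minimal, self-contained sketch of that check-then-create pattern. ``FakeSession``, its ``start_job`` method, and ``create_job_if_not_exist`` are hypothetical stand-ins written for this illustration. They are not part of the SageMaker SDK, and the stub raises ``ValueError`` where a real session would raise a service error.

```python
class FakeSession:
    """Hypothetical stand-in for a SageMaker session (not part of the SDK)."""

    def __init__(self, existing_jobs=()):
        self.jobs = set(existing_jobs)
        self.created = []  # records every job this stub was asked to start

    def describe_auto_ml_job(self, job_name):
        # The stub signals a missing job with ValueError; a real session
        # would surface a service error instead.
        if job_name not in self.jobs:
            raise ValueError(f"job {job_name!r} not found")
        return {"AutoMLJobName": job_name}

    def start_job(self, job_name):
        # Stands in for constructing AutoML(...) and calling fit(..., wait=True).
        self.jobs.add(job_name)
        self.created.append(job_name)


def create_job_if_not_exist(session, job_name):
    """Return True if the job had to be created, False if it already existed."""
    try:
        session.describe_auto_ml_job(job_name=job_name)
    except ValueError:
        # Only do the expensive setup when the job is missing.
        session.start_job(job_name)
        return True
    return False


session = FakeSession()
first = create_job_if_not_exist(session, "python-sdk-integ-test-base-job")
second = create_job_if_not_exist(session, "python-sdk-integ-test-base-job")
print(first, second, session.created)
```

In this sketch the second call is a no-op because the job already exists, which mirrors why the diff moves the ``AutoML`` construction into the ``except`` branch.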
