doc: add AutoML README (#1158)

chuyang-deng · laurenyu · commit 3b4ba23a3463 · 2019-12-16T10:37:43.000-08:00
diff --git a/README.rst b/README.rst
@@ -63,6 +63,7 @@ Table of Contents
 19. `Inference Pipelines <https://sagemaker.readthedocs.io/en/stable/overview.html#inference-pipelines>`__
 20. `Amazon SageMaker Operators for Kubernetes <#amazon-sagemaker-operators-for-kubernetes>`__
 21. `SageMaker Workflow <#sagemaker-workflow>`__
+22. `SageMaker Autopilot <#sagemaker-autopilot>`__
 
 
 Installing the SageMaker Python SDK
@@ -344,3 +345,15 @@ You can use Apache Airflow to author, schedule and monitor SageMaker workflow.
 For more information, see `SageMaker Workflow in Apache Airflow`_.
 
 .. _SageMaker Workflow in Apache Airflow: https://sagemaker.readthedocs.io/en/stable/using_workflow.html
+
+SageMaker Autopilot
+-------------------
+
+Amazon SageMaker Autopilot is an automated machine learning solution (commonly referred to as "AutoML") for tabular
+datasets. It automatically trains and tunes the best machine learning models for classification or regression based
+on your data, and hosts a series of models on an Inference Pipeline.
+
+For more information about SageMaker Autopilot, see `SageMaker Autopilot`_.
+
+.. _SageMaker Autopilot: src/sagemaker/automl/README.rst
+
diff --git a/src/sagemaker/automl/README.rst b/src/sagemaker/automl/README.rst
@@ -0,0 +1,67 @@
+SageMaker Autopilot
+===================
+
+Amazon SageMaker Autopilot is an automated machine learning solution (commonly referred to as "AutoML") for tabular
+datasets. It automatically trains and tunes the best machine learning models for classification or regression based
+on your data, and hosts a series of models on an Inference Pipeline.
+
+SageMaker AutoML Class
+~~~~~~~~~~~~~~~~~~~~~~
+
+The SageMaker ``AutoML`` class is similar to a SageMaker ``Estimator``: you define the attributes of an AutoML
+job and pass input data to start the job.
+
+Here's a simple example of using the ``AutoML`` object:
+
+.. code:: python
+
+    from sagemaker import AutoML
+
+    auto_ml = AutoML(
+        role="sagemaker-execution-role",
+        target_attribute_name="y",
+        sagemaker_session=sagemaker_session,
+    )
+    auto_ml.fit(inputs=inputs)
+
+
+The above code starts an AutoML job (data processing, training, tuning) and outputs a maximum of 500 candidates by
+default. You can change the number of output candidates by specifying ``max_candidates`` in the constructor. The AutoML
+job infers the problem type (BinaryClassification, MulticlassClassification, Regression), but you can also set it
+explicitly with ``problem_type`` in the constructor. Other configurable settings include security settings, time
+limits, job objectives, and tags.
+
+After an AutoML job completes, you can do the following with the result:
+
+#. Describe the AutoML job: ``describe_auto_ml_job()`` gives you an overview of the AutoML job, including the
+   job name, best candidate, input/output locations, problem type, and objective metrics.
+
+#. Get the best candidate: ``best_candidate()`` returns the best candidate of an AutoML job. You can view the
+   best candidate's step jobs, inference containers, and other information such as objective metrics.
+
+#. List all the candidates: ``list_candidates()`` returns all the candidates (up to the maximum number) of an AutoML
+   job, so that you can view and compare them.
+
+#. Deploy the best candidate (or any given candidate): ``deploy()`` deploys the best candidate to an inference
+   pipeline by default, but you can also specify a different candidate through the ``candidate`` parameter.
+
+For more information about ``AutoML`` parameters, please refer to: https://sagemaker.readthedocs.io/en/stable/sagemaker.automl.html
+
+SageMaker CandidateEstimator Class
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The SageMaker ``CandidateEstimator`` class converts a dictionary with AutoML candidate information to an object that
+allows you to re-run the candidate's step jobs.
+
+The simplest re-run is to feed a new dataset but reuse all other configurations from the candidate:
+
+.. code:: python
+
+    candidate_estimator = CandidateEstimator(candidate_dict)
+    inputs = new_inputs
+    candidate_estimator.fit(inputs=inputs)
+
+If you want more control over the candidate's step jobs, you can call ``get_steps()`` and construct the
+training/tuning jobs yourself.
+
+For more information about ``CandidateEstimator`` parameters, please refer to: https://sagemaker.readthedocs.io/en/stable/sagemaker.candidate_estimator.html
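The README added above says ``list_candidates()`` lets you view and compare candidates. A minimal sketch of what that comparison might look like follows. It assumes ``auto_ml`` is an ``AutoML`` object whose job has finished, and that each candidate is a dictionary with the ``CandidateName`` and ``FinalAutoMLJobObjectiveMetric`` keys used by the SageMaker API. The helper name ``summarize_candidates`` is hypothetical and not part of the SDK.

```python
def summarize_candidates(auto_ml):
    """Collect (name, metric name, metric value) for each candidate of a finished job.

    Assumptions: `auto_ml` exposes `list_candidates()` returning a list of dicts,
    and each dict follows the SageMaker API candidate shape. This helper is
    illustrative only and does not exist in the SDK.
    """
    rows = []
    for candidate in auto_ml.list_candidates():
        # The objective metric may be missing, for example on unfinished candidates.
        metric = candidate.get("FinalAutoMLJobObjectiveMetric") or {}
        rows.append(
            (candidate["CandidateName"], metric.get("MetricName"), metric.get("Value"))
        )
    return rows
```

The resulting rows can be printed or sorted before a chosen candidate is passed to ``deploy()`` through the ``candidate`` parameter.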
diff --git a/tests/integ/auto_ml_utils.py b/tests/integ/auto_ml_utils.py
@@ -27,16 +27,16 @@
 
 def create_auto_ml_job_if_not_exist(sagemaker_session):
     auto_ml_job_name = "python-sdk-integ-test-base-job"
-    auto_ml = AutoML(
-        role=ROLE,
-        target_attribute_name=TARGET_ATTRIBUTE_NAME,
-        sagemaker_session=sagemaker_session,
-        max_candidates=3,
-    )
 
     try:
-        auto_ml.describe_auto_ml_job(job_name=auto_ml_job_name)
+        sagemaker_session.describe_auto_ml_job(job_name=auto_ml_job_name)
     except Exception as e:  # noqa: F841
+        auto_ml = AutoML(
+            role=ROLE,
+            target_attribute_name=TARGET_ATTRIBUTE_NAME,
+            sagemaker_session=sagemaker_session,
+            max_candidates=3,
+        )
         inputs = sagemaker_session.upload_data(path=TRAINING_DATA, key_prefix=PREFIX + "/input")
         with timeout(minutes=AUTO_ML_DEFAULT_TIMEMOUT_MINUTES):
-            auto_ml.fit(inputs, job_name=auto_ml_job_name)
+            auto_ml.fit(inputs, job_name=auto_ml_job_name, wait=True)
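The hunk above moves the ``AutoML`` construction into the ``except`` branch, so the object is only built when the describe call fails. The same describe-first, create-on-failure pattern can be sketched in isolation as below. The names ``ensure_job``, ``describe``, and ``create`` are hypothetical stand-ins for illustration and are not part of the SDK or of this test utility.

```python
def ensure_job(job_name, describe, create):
    """Return the description of `job_name`, creating the job first if the lookup fails.

    `describe` and `create` are caller-supplied callables (hypothetical stand-ins
    for a describe call and a blocking job launch).
    """
    try:
        return describe(job_name)
    except Exception:
        # Broad catch kept to mirror the hunk above; a real helper would
        # probably narrow this to the "job not found" error.
        create(job_name)
        return describe(job_name)
```

Because ``create`` is only invoked on the failure path, repeated calls with an existing job name cost a single describe call.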