Update Lambda Step Notebook with content table and few updates (#3298)

shreyapandit · panditshreya · web-flow · commit bdefae1619e8 · 2022-04-07T07:47:47.000-05:00
* updates to lambda step notebook

* minor formatting changes

* Change text based on feedback

Co-authored-by: Shreya <shreya@shreyapandit.com>
diff --git a/sagemaker-pipelines/tabular/lambda-step/sagemaker-pipelines-lambda-step.ipynb b/sagemaker-pipelines/tabular/lambda-step/sagemaker-pipelines-lambda-step.ipynb
@@ -4,16 +4,16 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### SageMaker Pipelines Lambda Step\n",
+    "# SageMaker Pipelines Lambda Step\n",
     "\n",
     "This notebook illustrates how a Lambda function can be run as a step in a SageMaker Pipeline. \n",
     "\n",
     "The steps in this pipeline include -\n",
-    "* Preprocessing the abalone dataset\n",
+    "* Preprocessing the Abalone dataset\n",
     "* Train an XGBoost Model\n",
     "* Evaluate the model performance\n",
     "* Create a model\n",
-    "* Deploy the model to a SageMaker Hosted Endpoint using a Lambda Function\n",
+    "* Deploy the model to a SageMaker Hosted Endpoint using a Lambda Function, through SageMaker Pipelines\n",
     "\n",
     "A step to register the model into a Model Registry can be added to the pipeline using the `RegisterModel` step."
    ]
@@ -22,11 +22,26 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Prerequisites\n",
+    "## Contents\n",
     "\n",
-    "The notebook execution role should have policies which enable the notebook to create a Lambda function. The Amazon managed policy `AmazonSageMakerPipelinesIntegrations` can be added to the notebook execution role. \n",
+    "1. [Prerequisites](#Prerequisites)\n",
+    "1. [Configuration Setup](#Configuration-Setup)\n",
+    "1. [Data Preparation](#Data-Preparation)\n",
+    "1. [Model Training and Evaluation](#Model-Training-and-Evaluation)\n",
+    "1. [Setting up Lambda](#Setting-up-Lambda)\n",
+    "1. [Execute the Pipeline](#Execute-the-Pipeline)\n",
+    "1. [Clean up resources](#Clean-up-resources)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Prerequisites\n",
+    "\n",
+    "The notebook execution role should have policies which enable the notebook to create a Lambda function. Alternatively, the Amazon managed policy `AmazonSageMakerPipelinesIntegrations` can be added to the notebook execution role to grant these permissions.\n",
     "\n",
-    "The policy description is -\n",
+    "The policy description is as follows:\n",
     "\n",
     "```\n",
     "\n",
@@ -80,14 +95,10 @@
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "import sys\n",
-    "\n",
-    "!{sys.executable} -m pip install \"sagemaker>=2.51.0\""
+    "Let's start by importing the necessary packages and installing the SageMaker Python SDK."
    ]
   },
   {
@@ -132,7 +143,31 @@
     "from sagemaker.workflow.condition_step import ConditionStep\n",
     "from sagemaker.workflow.functions import JsonGet\n",
     "\n",
-    "from sagemaker.lambda_helper import Lambda"
+    "from sagemaker.lambda_helper import Lambda\n",
+    "import sys"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!{sys.executable} -m pip install \"sagemaker>=2.51.0\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Configuration Setup"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's now configure the setup we need, which includes the session object from the SageMaker Python SDK and the necessary configurations for the pipeline, such as instance types and input and output buckets."
    ]
   },
   {
@@ -188,9 +223,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Data Preparation\n",
+    "## Data Preparation\n",
     "\n",
-    "An SKLearn processor is used to prepare the dataset for the Hyperparameter Tuning job. Using the script `preprocess.py`, the dataset is featurized and split into train, test, and validation datasets. \n",
+    "A SKLearn processor is used to prepare the dataset for the Hyperparameter Tuning job. Using the script `preprocess.py`, the dataset is featurized and split into train, test, and validation datasets. \n",
     "\n",
     "The output of this step is used as the input to the TrainingStep"
    ]
@@ -363,9 +398,16 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Model Training\n",
+    "## Model Training and Evaluation\n",
     "\n",
-    "Train an XGBoost model with the output of the ProcessingStep."
+    "We will now train an XGBoost model with the SageMaker Python SDK, using the output of the ProcessingStep."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Training the Model"
    ]
   },
   {
@@ -429,7 +471,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Evaluate the model\n",
+    "#### Evaluating the model\n",
     "\n",
     "Use a processing job to evaluate the model from the TrainingStep. If the output of the evaluation is True, a model will be created and a Lambda will be invoked to deploy the model to a SageMaker Endpoint. "
    ]
@@ -553,7 +595,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Create the model\n",
+    "#### Creating the final model object\n",
     "\n",
     "The model is created and the name of the model is provided to the Lambda function for deployment. The `CreateModelStep` dynamically assigns a name to the model."
    ]
@@ -584,7 +626,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Create the Lambda Step\n",
+    "## Setting up Lambda\n",
     "\n",
     "When defining the LambdaStep, the SageMaker Lambda helper class provides helper functions for creating the Lambda function. Users can either use the `lambda_func` argument to provide the function ARN to an already deployed Lambda function OR use the `Lambda` class to create a Lambda function by providing a script, function name and role for the Lambda function. \n",
     "\n",
@@ -621,7 +663,7 @@
     "\n",
     "def lambda_handler(event, context):\n",
     "    \"\"\" \"\"\"\n",
-    "    sm_client = sagemaker.Session().sagemaker_client\n",
+    "    sm_client = boto3.client(\"sagemaker\")\n",
     "\n",
     "    # The name of the model created in the Pipeline CreateModelStep\n",
     "    model_name = event[\"model_name\"]\n",
@@ -657,9 +699,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### IAM Role\n",
+    "#### Setting up the custom IAM Role\n",
     "\n",
-    "The Lambda function needs an IAM role that will allow it to deploy a SageMaker Endpoint. The role ARN must be provided in the LambdaStep. \n",
+    "The Lambda function needs an IAM role that allows it to deploy a SageMaker Endpoint. The role ARN must be provided in the LambdaStep. \n",
     "\n",
     "The Lambda role should at minimum have policies to allow `sagemaker:CreateModel`, `sagemaker:CreateEndpointConfig`, `sagemaker:CreateEndpoint` in addition to the based Lambda execution policies. \n",
     "\n",
@@ -770,7 +812,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Execute the Pipeline"
+    "## Execute the Pipeline"
    ]
   },
   {
@@ -830,7 +872,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Cleaning up resources\n",
+    "## Clean up resources\n",
     "\n",
     "Running the following cell will delete the following resources created in this notebook -\n",
     "* SageMaker Model\n",