Commit d3a6c89

xinyu7030 and Xinyu Liu authored
add example notebook skeleton for fairness and explainability (#91)
Co-authored-by: Xinyu Liu <[email protected]>
1 parent 1556e09 commit d3a6c89

File tree

2 files changed: +614 −0 lines changed

@@ -0,0 +1,307 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*This notebook serves as a rough skeleton for new example notebook additions, giving guidance on standards and best practices. It does not need to be adhered to exactly. The actual content of the notebook should determine the appropriate way to present that information, so deviate as needed. Indeed, there are several example notebooks that couldn't follow this template because they do not use some components of SageMaker's functionality (e.g. [SageMaker and Redshift](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/working_with_redshift_data/working_with_redshift_data.ipynb), [TensorFlow BYOM](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/tensorflow_iris_byom/tensorflow_BYOM_iris.ipynb), etc.). However, there are several elements which are mandatory for inclusion. These are marked explicitly below.*\n",
    "\n",
    "*NOTE: There are several best practices included in this notebook (Amazon copyright included in the notebook metadata, notebook opens in a SageMaker env (conda_python3 should be the default)). So, even if the content is expected to deviate dramatically, starting from this notebook will help you conform to standards.*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Title\n",
    "_**Subtitle**_\n",
    "\n",
    "---\n",
    "\n",
    "---\n",
    "\n",
    "\n",
    "## Contents\n",
    "\n",
    "1. [Background](#Background)\n",
    "1. [Setup](#Setup)\n",
    "1. [Data](#Data)\n",
    "1. [Train](#Train)\n",
    "1. [Host](#Host)\n",
    "  1. [Evaluate](#Evaluate)\n",
    "1. [Extensions](#Extensions)\n",
    "\n",
    "---\n",
    "\n",
    "## Background\n",
    "\n",
    "_This section should contain several paragraphs describing the topic at a high level, the data source, the algorithms, and/or any specific libraries or engineering used._\n",
    "\n",
    "---\n",
    "\n",
    "## Setup\n",
    "\n",
    "_This notebook was created and tested on an ml.< instance type > notebook instance._\n",
    "\n",
    "Let's start by specifying:\n",
    "\n",
    "- The S3 bucket and prefix that you want to use for training and model data. This should be within the same region as the notebook instance, training, and hosting.\n",
    "- The IAM role ARN used to give training and hosting access to your data. See the documentation for how to create these. Note, if more than one role is required for notebook instances, training, and/or hosting, please replace `sagemaker.get_execution_role()` with the appropriate full IAM role ARN string(s)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true,
    "isConfigCell": true
   },
   "outputs": [],
   "source": [
    "import sagemaker\n",
    "\n",
    "sess = sagemaker.Session()\n",
    "bucket = sess.default_bucket()\n",
    "base = 'DEMO-<notebook_name_here>'  # notebook author to input a short descriptive name\n",
    "prefix = 'sagemaker/' + base\n",
    "\n",
    "role = sagemaker.get_execution_role()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we'll import the Python libraries we'll need."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Uncomment (and extend) the imports your notebook actually needs:\n",
    "#import sagemaker\n",
    "#import boto3\n",
    "#import pandas as pd\n",
    "#import numpy as np\n",
    "#import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## Data\n",
    "\n",
    "*This section should describe the dataset at a high level. Typically the dataset will be downloaded with a `wget` or taken from S3. If so, make sure to have the dataset blessed ([example ticket](https://tt.amazon.com/0132482232)). Oftentimes this requires that we properly attribute the dataset source.*\n",
    "\n",
    "*Then we should open and explore the data briefly or deeply, depending on the depth of understanding the customer will need to contextualize the remainder of the example notebook.*\n",
    "\n",
    "*Finally, if using SageMaker built-in algorithms that take a protobuf dataset, please ensure the conversion is in the appropriate format (and ideally uses the SageMaker Python SDK). Also ensure this final dataset is uploaded to the appropriate bucket + prefix S3 location.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
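  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*As a purely illustrative, hedged sketch of what the placeholder cell above might contain (not part of the original template): load a hypothetical local `train.csv` standing in for the blessed dataset, take a quick look at it, and upload it to the bucket + prefix defined in Setup. Every file name here is a placeholder.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Illustrative only: 'train.csv' is a hypothetical local file standing in\n",
    "# for the notebook's actual (properly attributed) dataset.\n",
    "df = pd.read_csv('train.csv')\n",
    "df.head()\n",
    "\n",
    "# Upload the prepared dataset to the bucket + prefix chosen in Setup.\n",
    "train_s3_uri = sess.upload_data('train.csv', bucket=bucket, key_prefix=prefix + '/train')\n",
    "print(train_s3_uri)"
   ]
  },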
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## Train\n",
    "\n",
    "*This section mainly includes the transactional aspects of training your algorithm with SageMaker. All notes below may be handled to a greater or lesser extent depending on whether training is done using boto3 and the AWS SDK, a generic estimator from the Python SDK, or a custom estimator from the Python SDK.*\n",
    "\n",
    "*Regardless, the algorithm should train dynamically using the right container by region (in all regions), along the lines of:*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import boto3\n",
    "\n",
    "container = sagemaker.amazon.amazon_estimator.get_image_uri(boto3.Session().region_name,\n",
    "                                                            'factorization-machines',\n",
    "                                                            'latest')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*It should train on the appropriate instance type with a bias toward ml.m4.xlarge (since that's free tier eligible), and should have a reasonable value for MaxRuntimeInSeconds.*\n",
    "\n",
    "*Both input data and output locations should be set to the appropriate S3 bucket + prefix.*\n",
    "\n",
    "*Descriptions of hyperparameters and their usage (e.g. increasing this hyperparameter will lead to more X), ideally with a link to the documentation for further information, are highly preferred.*\n",
    "\n",
    "*If using boto3 and the AWS SDK, make sure to use the appropriate try: finally: logic on the waiter.*"
   ]
  },
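  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*The waiter example below assumes a `create_training_params` dict and a `job_name` already exist. As a hedged sketch (not part of the original template), they might look something like this, with a free-tier-eligible ml.m4.xlarge instance, an explicit MaxRuntimeInSeconds, and input/output locations under the bucket + prefix from Setup. The hyperparameter names and values are placeholders for whatever algorithm the notebook actually uses.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from time import gmtime, strftime\n",
    "\n",
    "# Illustrative sketch only: hyperparameter names belong to the algorithm you\n",
    "# actually use; these values are placeholders.\n",
    "job_name = 'DEMO-training-' + strftime('%Y-%m-%d-%H-%M-%S', gmtime())\n",
    "\n",
    "create_training_params = {\n",
    "    'TrainingJobName': job_name,\n",
    "    'AlgorithmSpecification': {'TrainingImage': container,\n",
    "                               'TrainingInputMode': 'File'},\n",
    "    'RoleArn': role,\n",
    "    'ResourceConfig': {'InstanceCount': 1,\n",
    "                       'InstanceType': 'ml.m4.xlarge',\n",
    "                       'VolumeSizeInGB': 10},\n",
    "    'InputDataConfig': [{\n",
    "        'ChannelName': 'train',\n",
    "        'DataSource': {'S3DataSource': {'S3DataType': 'S3Prefix',\n",
    "                                        'S3Uri': 's3://{}/{}/train'.format(bucket, prefix),\n",
    "                                        'S3DataDistributionType': 'FullyReplicated'}}\n",
    "    }],\n",
    "    'OutputDataConfig': {'S3OutputPath': 's3://{}/{}/output'.format(bucket, prefix)},\n",
    "    'StoppingCondition': {'MaxRuntimeInSeconds': 60 * 60},\n",
    "    'HyperParameters': {'feature_dim': '784',\n",
    "                        'num_factors': '10',\n",
    "                        'predictor_type': 'binary_classifier'}\n",
    "}"
   ]
  },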
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Example logic for waiter\n",
    "client = boto3.client('sagemaker')\n",
    "\n",
    "client.create_training_job(**create_training_params)\n",
    "\n",
    "status = client.describe_training_job(TrainingJobName=job_name)['TrainingJobStatus']\n",
    "print(status)\n",
    "\n",
    "try:\n",
    "    client.get_waiter('training_job_completed_or_stopped').wait(TrainingJobName=job_name)\n",
    "finally:\n",
    "    status = client.describe_training_job(TrainingJobName=job_name)['TrainingJobStatus']\n",
    "    print(\"Training job ended with status: \" + status)\n",
    "    if status == 'Failed':\n",
    "        message = client.describe_training_job(TrainingJobName=job_name)['FailureReason']\n",
    "        print('Training failed with the following error: {}'.format(message))\n",
    "        raise Exception('Training job failed')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## Host\n",
    "\n",
    "*This section mainly includes the transactional aspects of hosting your algorithm with SageMaker. Notes below may be handled to a greater or lesser extent depending on whether hosting is done using boto3 and the AWS SDK or the Python SDK.*\n",
    "\n",
    "*The algorithm needs to host using the right container by region, similar to training above.*\n",
    "\n",
    "*Unless there is an explicit reason not to, all hosting should be done on a single ml.m4.xlarge instance as they are eligible for the free tier.*\n",
    "\n",
    "*Endpoints should always be named with a prefix of \"DEMO\" so that they can be properly distinguished from production endpoints.*\n",
    "\n",
    "*If using boto3 and the AWS SDK, make sure to use the appropriate try: finally: logic in the waiter.*"
   ]
  },
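  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*The endpoint example below assumes an `endpoint_config_name` already exists. As a hedged sketch (not part of the original template), creating the model and its endpoint configuration on a single free-tier-eligible ml.m4.xlarge might look like the following; all names are placeholders following the DEMO convention.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from time import gmtime, strftime\n",
    "\n",
    "# Illustrative sketch only: register the trained model artifacts, then create\n",
    "# an endpoint configuration pointing at them.\n",
    "model_name = 'DEMO-model-' + strftime('%Y-%m-%d-%H-%M-%S', gmtime())\n",
    "model_data = client.describe_training_job(TrainingJobName=job_name)['ModelArtifacts']['S3ModelArtifacts']\n",
    "\n",
    "client.create_model(\n",
    "    ModelName=model_name,\n",
    "    PrimaryContainer={'Image': container, 'ModelDataUrl': model_data},\n",
    "    ExecutionRoleArn=role)\n",
    "\n",
    "endpoint_config_name = 'DEMO-config-' + strftime('%Y-%m-%d-%H-%M-%S', gmtime())\n",
    "client.create_endpoint_config(\n",
    "    EndpointConfigName=endpoint_config_name,\n",
    "    ProductionVariants=[{'VariantName': 'AllTraffic',\n",
    "                         'ModelName': model_name,\n",
    "                         'InstanceType': 'ml.m4.xlarge',\n",
    "                         'InitialInstanceCount': 1}])"
   ]
  },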
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Example logic for waiter\n",
    "from time import gmtime, strftime\n",
    "\n",
    "endpoint_name = 'DEMO-' + strftime(\"%Y-%m-%d-%H-%M-%S\", gmtime())\n",
    "print(endpoint_name)\n",
    "create_endpoint_response = client.create_endpoint(\n",
    "    EndpointName=endpoint_name,\n",
    "    EndpointConfigName=endpoint_config_name)\n",
    "print(create_endpoint_response['EndpointArn'])\n",
    "\n",
    "resp = client.describe_endpoint(EndpointName=endpoint_name)\n",
    "status = resp['EndpointStatus']\n",
    "print(\"Status: \" + status)\n",
    "\n",
    "try:\n",
    "    client.get_waiter('endpoint_in_service').wait(EndpointName=endpoint_name)\n",
    "finally:\n",
    "    resp = client.describe_endpoint(EndpointName=endpoint_name)\n",
    "    status = resp['EndpointStatus']\n",
    "    print(\"Arn: \" + resp['EndpointArn'])\n",
    "    print(\"Create endpoint ended with status: \" + status)\n",
    "\n",
    "    if status != 'InService':\n",
    "        message = client.describe_endpoint(EndpointName=endpoint_name)['FailureReason']\n",
    "        print('Endpoint creation failed with the following error: {}'.format(message))\n",
    "        raise Exception('Endpoint creation did not succeed')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Evaluate\n",
    "\n",
    "*This section should include details on how to serialize data in preparation for invoking the endpoint, deserializing the response, and understanding the model's outputs.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
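  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*As a hedged sketch of what the placeholder cell above might contain (not part of the original template): serialize one record as CSV, invoke the endpoint, and deserialize the JSON response. The payload shape and content type depend entirely on the algorithm, so treat these as placeholders.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "runtime = boto3.client('sagemaker-runtime')\n",
    "\n",
    "# Illustrative only: a single CSV record; real serialization depends on the model.\n",
    "payload = '0.5,1.2,3.4'\n",
    "\n",
    "response = runtime.invoke_endpoint(EndpointName=endpoint_name,\n",
    "                                   ContentType='text/csv',\n",
    "                                   Body=payload)\n",
    "result = json.loads(response['Body'].read().decode('utf-8'))\n",
    "print(result)"
   ]
  },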
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## Extensions\n",
    "\n",
    "*This section can be a simple paragraph on what could be done to improve upon or extend the existing example. Examples would be to spend more time tuning hyperparameters, scale to a larger dataset, go through the next example notebook in a sequence, etc.*\n",
    "\n",
    "### (Optional) Clean-up\n",
    "\n",
    "*At the end of the example notebook, make sure to have a cell that deletes the endpoint(s) you created. Optionally, you may delete the S3 data, models, endpoints, or any other artifacts created during the example.*\n",
    "\n",
    "_Prior to publishing the notebook, we typically clear all cell outputs using Cell -> All Output -> Clear in the Jupyter menu above._\n",
    "\n",
    "*Confirm that the notebook runs end-to-end in a SageMaker Notebook Instance in multiple regions. Review with your SDE and PM in advance if you have any questions. Otherwise, submit a pull request to the [GitHub repo](https://github.com/awslabs/amazon-sagemaker-examples).*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "client.delete_endpoint(EndpointName=endpoint_name)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  },
  "notice": "Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
