119 | 119 | "cell_type": "markdown",
120 | 120 | "metadata": {},
121 | 121 | "source": [
122 |     | - "# Run a Ground Truth labeling job\n",
    | 122 | + "## Run a Ground Truth labeling job\n",
123 | 123 | "**This section should take about 3h to complete.**\n",
124 | 124 | "\n",
125 | 125 | "We will first run a labeling job. This involves several steps: collecting the images we want labeled, specifying the possible label categories, creating instructions, and writing a labeling job specification. In addition, we highly recommend running a (free) mock job using a private workforce before you submit any job to the public workforce. This notebook explains how to do that as an optional step. Without the private-workforce step, this section should take about 3h from start to completion of your labeling job. However, this may vary depending on the availability of the public annotation workforce.\n",
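
A minimal sketch of what the resulting job specification looks like when submitted through `boto3` (all names, S3 URIs, and ARNs below are placeholders rather than values defined in this notebook, and the built-in pre/post-annotation Lambda ARNs are region-specific):

```python
import boto3

sagemaker_client = boto3.client("sagemaker")

sagemaker_client.create_labeling_job(
    LabelingJobName="ground-truth-demo",
    # Attribute under which the consolidated label is stored in the output manifest.
    LabelAttributeName="category",
    RoleArn="arn:aws:iam::<account-id>:role/<sagemaker-execution-role>",
    InputConfig={
        "DataSource": {
            "S3DataSource": {"ManifestS3Uri": "s3://<bucket>/input.manifest"}
        }
    },
    OutputConfig={"S3OutputPath": "s3://<bucket>/output/"},
    # JSON file listing the possible label categories.
    LabelCategoryConfigS3Uri="s3://<bucket>/class_labels.json",
    HumanTaskConfig={
        # Use your private workteam ARN for the recommended mock run,
        # or the public (Mechanical Turk) workteam ARN for the real job.
        "WorkteamArn": "arn:aws:sagemaker:<region>:<account-id>:workteam/private-crowd/<team>",
        # The worker UI template holding your instructions.
        "UiConfig": {"UiTemplateS3Uri": "s3://<bucket>/instructions.template"},
        # Region-specific built-in Lambdas for image classification.
        "PreHumanTaskLambdaArn": "arn:aws:lambda:<region>:<account>:function:PRE-ImageMultiClass",
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": "arn:aws:lambda:<region>:<account>:function:ACS-ImageMultiClass"
        },
        "TaskTitle": "Image classification",
        "TaskDescription": "Select the category that best matches the image",
        "NumberOfHumanWorkersPerDataObject": 3,
        "TaskTimeLimitInSeconds": 300,
    },
)
```
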
786 | 786 | "cell_type": "markdown",
787 | 787 | "metadata": {},
788 | 788 | "source": [
789 |     | - "# Analyze Ground Truth labeling job results\n",
    | 789 | + "## Analyze Ground Truth labeling job results\n",
790 | 790 | "**This section should take about 20min to complete.**\n",
791 | 791 | "\n",
792 | 792 | "After the job finishes running (**make sure `sagemaker_client.describe_labeling_job` shows the job is complete!**), it is time to analyze the results. The plots in the [Monitor job progress](#Monitor-job-progress) section form part of the analysis. In this section, we will gain additional insights into the results, all contained in the `output manifest`. You can find the location of the output manifest under `AWS Console > SageMaker > Labeling Jobs > [name of your job]`. We will obtain it programmatically in the cell below.\n",
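
The programmatic retrieval is a sketch like the following, assuming `job_name` holds the name of your labeling job:

```python
import json

import boto3

sagemaker_client = boto3.client("sagemaker")
s3 = boto3.client("s3")

description = sagemaker_client.describe_labeling_job(LabelingJobName=job_name)
assert description["LabelingJobStatus"] == "Completed"

# S3 URI of the output manifest produced by the job.
output_manifest_uri = description["LabelingJobOutput"]["OutputDatasetS3Uri"]

# The manifest is JSON Lines: one JSON object per labeled image.
bucket, key = output_manifest_uri.replace("s3://", "").split("/", 1)
body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
output_manifest = [json.loads(line) for line in body.splitlines()]
```
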
1095 | 1095 | "cell_type": "markdown",
1096 | 1096 | "metadata": {},
1097 | 1097 | "source": [
1098 |      | - "# Compare Ground Truth results to known, pre-labeled data\n",
     | 1098 | + "## Compare Ground Truth results to known, pre-labeled data\n",
1099 | 1099 | "**This section should take about 5 minutes to complete.**\n",
1100 | 1100 | "\n",
1101 | 1101 | "Sometimes (for example, when benchmarking the system) we have an alternative set of data labels available. \n",
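
When such a label set exists, the comparison itself is simple. A sketch, assuming hypothetical lists `gt_labels` (the Ground Truth results) and `standard_labels` (the pre-existing labels), aligned to the same images:

```python
from collections import Counter

import numpy as np

# Hypothetical inputs: equal-length lists of category names for the
# same images, one from Ground Truth and one from the known labels.
gt = np.array(gt_labels)
std = np.array(standard_labels)

# Overall agreement rate between the two label sets.
print(f"Agreement: {(gt == std).mean():.1%}")

# Confusion counts: how often each known label maps to each GT label.
for (known, labeled), count in sorted(Counter(zip(std, gt)).items()):
    print(f"{known:>15} -> {labeled:<15} {count}")
```
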
1275 | 1275 | "cell_type": "markdown",
1276 | 1276 | "metadata": {},
1277 | 1277 | "source": [
1278 |      | - "# Train an image classifier using Ground Truth labels\n",
     | 1278 | + "## Train an image classifier using Ground Truth labels\n",
1279 | 1279 | "At this stage, we have fully labeled our dataset and we can train a machine learning model to classify images based on the categories we previously defined. We'll do so using the **augmented manifest** output of our labeling job - no additional file translation or manipulation required! For a more complete description of the augmented manifest, see our other [example notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/ground_truth_labeling_jobs/object_detection_augmented_manifest_training/object_detection_augmented_manifest_training.ipynb).\n",
1280 | 1280 | "\n",
1281 | 1281 | "**NOTE:** Training neural networks to high accuracy often requires a careful choice of hyperparameters. In this case, we hand-picked hyperparameters that work reasonably well for this dataset. The neural network should reach an accuracy of about **60% if you're using 100 datapoints, and over 95% if you're using 1000 datapoints**. To train neural networks on novel data, consider using [SageMaker's model tuning / hyperparameter optimization algorithms](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html).\n",
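
A sketch of how the augmented manifest can be wired into a training job with the built-in image classification algorithm via the SageMaker Python SDK. The S3 paths, the `category` attribute name, and the hyperparameter values are illustrative; `num_classes` and `num_training_samples` must match your data:

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Built-in image classification container for the current region.
training_image = image_uris.retrieve("image-classification", session.boto_region_name)

estimator = Estimator(
    training_image,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    output_path="s3://<bucket>/training-output/",
    sagemaker_session=session,
)
estimator.set_hyperparameters(
    num_layers=18,
    use_pretrained_model=1,
    image_shape="3,224,224",
    num_classes=5,              # must match your label categories
    num_training_samples=800,   # must match your training manifest
    epochs=30,
    learning_rate=0.001,
)

def manifest_channel(manifest_uri):
    # Each manifest line supplies the image ("source-ref") and its
    # label (here assumed to be stored under "category").
    return TrainingInput(
        manifest_uri,
        content_type="application/x-recordio",
        record_wrapping="RecordIO",
        s3_data_type="AugmentedManifestFile",
        attribute_names=["source-ref", "category"],
    )

estimator.fit({
    "train": manifest_channel("s3://<bucket>/train.manifest"),
    "validation": manifest_channel("s3://<bucket>/validation.manifest"),
})
```
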
1434 | 1434 | "cell_type": "markdown",
1435 | 1435 | "metadata": {},
1436 | 1436 | "source": [
1437 |      | - "# Deploy the Model \n",
     | 1437 | + "## Deploy the Model \n",
|
1439 | 1439 | "Now that we've fully labeled our dataset and have a trained model, we want to use the model to perform inference.\n",
1440 | 1440 | "\n",
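
A sketch of the deploy-and-predict round trip, assuming the `estimator` from the training section and a local test image (the file name and instance type are illustrative):

```python
import json

# Create a hosted, real-time endpoint backed by the trained model.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# The built-in image classification algorithm accepts raw image bytes
# and returns a list of per-class probabilities.
with open("test-image.jpg", "rb") as f:
    payload = f.read()

response = predictor.predict(payload, initial_args={"ContentType": "application/x-image"})
probabilities = json.loads(response)
print(probabilities)

# Delete the endpoint when done to stop incurring charges.
predictor.delete_endpoint()
```
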
1736 | 1736 | "cell_type": "markdown",
1737 | 1737 | "metadata": {},
1738 | 1738 | "source": [
1739 |      | - "# Review\n",
     | 1739 | + "## Review\n",
|
1741 | 1741 | "We covered a lot of ground in this notebook! Let's recap what we accomplished. First, we started with an unlabeled dataset (technically, the dataset was previously labeled by its authors, but we discarded the original labels for the purposes of this demonstration). Next, we created a SageMaker Ground Truth labeling job and generated new labels for all of the images in our dataset. Then we split the resulting output manifest into a training set and a validation set and trained a SageMaker image classification model. Finally, we created a hosted model endpoint and used it to make a live prediction for a held-out image in the original dataset."
1742 | 1742 | ]