Commit afc07cd

committed
More text added to the BYO example
1 parent 35789ab commit afc07cd

1 file changed

+48
-3
lines changed

scikit_bring_your_own/scikit_bring_your_own.ipynb

Lines changed: 48 additions & 3 deletions
@@ -8,15 +8,33 @@
 "\n",
 "With Amazon SageMaker, you can package your own algorithms that can than be trained and deployed in the SageMaker environment. This notebook will guide you through an example that shows you how to build a Docker container for SageMaker and use it for training and inference.\n",
 "\n",
+"By packaging an algorithm in a container, you can bring almost any code to the Amazon SageMaker environment, regardless of programming language, environment, framework, or dependencies. \n",
+"\n",
 "_TODO: Insert TOC here_\n",
 "\n",
 "## When should I build my own algorithm container?\n",
 "\n",
+"You may not need to create a container to bring your own code to Amazon SageMaker. When you are using a framework (such as Apache MXNet or TensorFlow) that has direct support in SageMaker, you can simply supply the Python code that implements your algorithm using the SDK entry points for that framework. This set of frameworks is continually expanding, so we recommend that you check the current list if your algorithm is written in a common machine learning environment.\n",
+"\n",
+"Even if there is direct SDK support for your environment or framework, you may find it more effective to build your own container. If the code that implements your algorithm is quite complex on its own or you need special additions to the framework, building your own container may be the right choice.\n",
+"\n",
+"If there isn't direct SDK support for your environment, don't worry. You'll see in this walk-through that building your own container is quite straightforward.\n",
+"\n",
 "## The example\n",
 "\n",
+"TODO: links\n",
+"\n",
+"Here, we'll show how to package a simple Python example which showcases the decision tree algorithm from the widely used scikit-learn machine learning package. The example is purposefully fairly trivial since the point is to show the surrounding structure that you'll want to add to your own code so you can train and host it in Amazon SageMaker.\n",
+"\n",
+"The ideas shown here will work in any language or environment. You'll need to choose the right tools for your environment to serve HTTP requests for inference, but good HTTP environments are available in every language these days.\n",
+"\n",
+"In this example, we use a single image to support training and hosting. This is easy because it means that we only need to manage one image and we can set it up to do everything. Sometimes you'll want separate images for training and hosting because they have different requirements. Just separate the parts discussed below into separate Dockerfiles and build two images. Choosing whether to have a single image or two images is really a matter of which is more convenient for you to develop and manage.\n",
+"\n",
+"If you're only using Amazon SageMaker for training or hosting, but not both, there is no need to build the unused functionality into your container.\n",
+"\n",
 "## The presentation\n",
 "\n",
-"This presentation is divided into two parts: building the container and using the container."
+"This presentation is divided into two parts: _building_ the container and _using_ the container."
 ]
 },
 {
@@ -27,17 +45,44 @@
 "\n",
 "### An overview of Docker\n",
 "\n",
+"TODO: links to docker, docker run, and Dockerfile reference. and ECS\n",
+"TODO: check virtualenv spelling\n",
+"\n",
 "If you're familiar with Docker already, you can skip ahead to the next section.\n",
 "\n",
-"### How Amazon SageMaker runs your Docker container during training\n",
+"For many data scientists, Docker containers are a new concept, but they are not difficult, as you'll see here. \n",
+"\n",
+"Docker provides a simple way to package arbitrary code into an _image_ that is totally self-contained. Once you have an image, you can use Docker to run a _container_ based on that image. Running a container is just like running a program on the machine except that the container creates a fully self-contained environment for the program to run. Containers are isolated from each other and from the host environment, so the way you set up your program is the way it runs, no matter where you run it.\n",
+"\n",
+"Docker is more powerful than environment managers like conda or virtualenv because (a) it is completely language independent and (b) it comprises your whole operating environment, including startup commands, environment variable, etc.\n",
+"\n",
+"In some ways, a docker container is like a virtual machine, but it is much lighter weight. For example, a program running in a container can start in less than a second and many containers can run on the same physical machine or virtual machine instance.\n",
+"\n",
+"Docker uses a simple file called a `Dockerfile` to specify how the image is assembled. We'll see an example of that below. You can build your Docker images based on Docker images built by yourself or others, which can simplify things quite a bit.\n",
+"\n",
+"Docker has become very popular in the programming and devops communities for its flexibility and well-defined specification of the code to be run. It is the underpinning of many services built in the past few years, such as Amazon ECS.\n",
+"\n",
+"Amazon SageMaker uses Docker to allow users to train and deploy arbitrary algorithms.\n",
+"\n",
+"In Amazon SageMaker, Docker containers are invoked in a certain way for training and a slightly different way for hosting. The following sections outline how to build containers for the SageMaker environment.\n",
+"\n",
+"### How Amazon SageMaker runs your Docker container\n",
+"\n",
+"Because you can run the same image in training or hosting, Amazon SageMaker runs your container with the argument `train` or `serve`. How your container processes this argument depends on the container:\n",
+"\n",
+"* In the example here, we don't define an `ENTRYPOINT` in the Dockerfile so Docker will run the command `train` at training time and `serve` at serving time. In this example, we define these as executable Python scripts, but they could be any program that we want to start in that environment.\n",
+"* If you specify a program as an `ENTRYPOINT` in the Dockerfile, that program will be run at startup and its first argument will be `train` or `serve`. The program can then look at that argument and decide what to do.\n",
+"* If you are building separate containers for training and hosting (or building only for one or the other), you can define a program as an `ENTRYPOINT` in the Dockerfile and ignore (or verify) the first argument passed in. \n",
+"\n",
+"#### Running your container during training\n",
 "\n",
 "The container is run with the argument \"train\"\n",
 "\n",
 "The container gets some special files:\n",
 "\n",
 "TODO: Insert overview of file system here\n",
 "\n",
-"### How Amazon SageMaker runs your Docker container during hosting\n",
+"#### Running your container during hosting\n",
 "\n",
 "The container is run with the argument \"serve\". \n",
 "\n"

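Illustrative note on the `ENTRYPOINT` case described in the added text: when a single program is the entry point, it receives `train` or `serve` as its first argument and decides what to do. A minimal Python sketch of that dispatch follows. The names `run_training`, `run_serving`, and `dispatch` are hypothetical placeholders and do not appear in the commit; the real notebook uses separate executable `train` and `serve` scripts instead.

```python
import sys


def run_training():
    """Hypothetical placeholder: fit the model and save its artifacts."""
    return "training"


def run_serving():
    """Hypothetical placeholder: start an HTTP server for inference."""
    return "serving"


# Map the first command-line argument to the routine it selects.
MODES = {"train": run_training, "serve": run_serving}


def dispatch(argv):
    """Run the routine named by argv[1]; reject anything else.

    argv follows the sys.argv convention: argv[0] is the program name and
    argv[1] is expected to be "train" or "serve".
    """
    if len(argv) < 2 or argv[1] not in MODES:
        raise ValueError("expected first argument to be 'train' or 'serve'")
    return MODES[argv[1]]()


# Example calls with explicit argument lists (a real entry point would
# pass sys.argv instead).
print(dispatch(["entrypoint", "train"]))
print(dispatch(["entrypoint", "serve"]))
```

A container built only for training or only for hosting could reuse the same check simply to verify the argument, as the third bullet in the diff suggests.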