
Commit 210f9f1

Update Sphinx doc
1 parent 442b267 commit 210f9f1

2 files changed: +56 -24 lines changed

doc/using_mxnet.rst

Lines changed: 55 additions & 23 deletions
@@ -532,29 +532,39 @@ For more information on how to enable MXNet to interact with Amazon Elastic Infe
Model serving
^^^^^^^^^^^^^

-After the SageMaker model server has loaded your model, by calling either the default ``model_fn`` or the implementation in your training script, SageMaker will serve your model. Model serving is the process of responding to inference requests, received by SageMaker InvokeEndpoint API calls. The SageMaker MXNet model server breaks request handling into three steps:
+After the SageMaker model server has loaded your model, by calling either the default ``model_fn`` or the implementation in your script, SageMaker will serve your model.
+Model serving is the process of responding to inference requests, received by SageMaker InvokeEndpoint API calls.
+Defining how to handle these requests can be done in one of two ways:

+- using ``input_fn``, ``predict_fn``, and ``output_fn``, some of which may be your own implementations
+- writing your own ``transform_fn`` for handling input processing, prediction, and output processing

-- input processing,
-- prediction, and
-- output processing.
+Using ``input_fn``, ``predict_fn``, and ``output_fn``
+'''''''''''''''''''''''''''''''''''''''''''''''''''''

-In a similar way to previous steps, you configure these steps by defining functions in your Python source file.
+The SageMaker MXNet model server breaks request handling into three steps:

-Each step involves invoking a python function, with information about the request and the return-value from the previous function in the chain. Inside the SageMaker MXNet model server, the process looks like:
+- input processing
+- prediction
+- output processing

+Just like with ``model_fn``, you configure these steps by defining functions in your Python source file.

+Each step has its own Python function, which takes in information about the request and the return value from the previous function in the chain.
+Inside the SageMaker MXNet model server, the process looks like:

.. code:: python

    # Deserialize the Invoke request body into an object we can perform prediction on
-    input_object = input_fn(request_body, request_content_type, model)
+    input_object = input_fn(request_body, request_content_type)

    # Perform prediction on the deserialized object, with the loaded model
    prediction = predict_fn(input_object, model)

    # Serialize the prediction result into the desired response content type
    output = output_fn(prediction, response_content_type)

-The above code-sample shows the three function definitions:
+The above code sample shows the three function definitions that correspond to the three steps mentioned above:

- ``input_fn``: Takes request data and deserializes the data into an
  object for prediction.
@@ -563,7 +573,10 @@ The above code-sample shows the three function definitions:
- ``output_fn``: Takes the result of prediction and serializes this
  according to the response content type.

-The SageMaker MXNet model server provides default implementations of these functions. These work with common-content types, and Gluon API and Module API model objects. You can provide your own implementations for these functions in your training script. If you omit any definition then the SageMaker MXNet model server will use its default implementation for that function.
+The SageMaker MXNet model server provides default implementations of these functions.
+These work with common content types, and Gluon API and Module API model objects.
+You can also provide your own implementations for these functions in your training script.
+If you omit any definition then the SageMaker MXNet model server will use its default implementation for that function.

If you rely solely on the SageMaker MXNet model server defaults, you get the following functionality:

@@ -575,36 +588,36 @@ If you rely solely on the SageMaker MXNet model server defaults, you get the fol
In the following sections we describe the default implementations of ``input_fn``, ``predict_fn``, and ``output_fn``. We describe the input arguments and expected return types of each, so you can define your own implementations.

Input processing
-''''''''''''''''
+""""""""""""""""

When an InvokeEndpoint operation is made against an Endpoint running a SageMaker MXNet model server, the model server receives two pieces of information:

-- The request Content-Type, for example "application/json"
-- The request data body, a byte array
+- The request's content type, for example "application/json"
+- The request data body as a byte array

-The SageMaker MXNet model server will invoke an ``input_fn`` function in your training script, passing in this information. If you define an ``input_fn`` function definition, it should return an object that can be passed to ``predict_fn`` and have the following signature:
+The SageMaker MXNet model server will invoke ``input_fn``, passing in this information. If you provide your own ``input_fn``, it should return an object that can be passed to ``predict_fn``, and it should have the following signature:

.. code:: python

-    def input_fn(request_body, request_content_type, model)
+    def input_fn(request_body, request_content_type)

-Where ``request_body`` is a byte buffer, ``request_content_type`` is a Python string, and model is the result of invoking ``model_fn``.
+Where ``request_body`` is a byte buffer and ``request_content_type`` is the content type of the request.

The SageMaker MXNet model server provides a default implementation of ``input_fn``. This function deserializes JSON or CSV encoded data into an MXNet ``NDArrayIter`` `(external API docs) <https://mxnet.incubator.apache.org/api/python/io.html#mxnet.io.NDArrayIter>`__ multi-dimensional array iterator. This works with the default ``predict_fn`` implementation, which expects an ``NDArrayIter`` as input.

-Default json deserialization requires ``request_body`` contain a single json list. Sending multiple json objects within the same ``request_body`` is not supported. The list must have a dimensionality compatible with the MXNet ``net`` or ``Module`` object. Specifically, after the list is loaded, it's either padded or split to fit the first dimension of the model input shape. The list's shape must be identical to the model's input shape, for all dimensions after the first.
+Default JSON deserialization requires that ``request_body`` contain a single JSON list. Sending multiple JSON objects within the same ``request_body`` is not supported. The list must have a dimensionality compatible with the MXNet ``net`` or ``Module`` object. Specifically, after the list is loaded, it's either padded or split to fit the first dimension of the model input shape. The list's shape must be identical to the model's input shape, for all dimensions after the first.

-Default csv deserialization requires ``request_body`` contain one or more lines of CSV numerical data. The data is loaded into a two-dimensional array, where each line break defines the boundaries of the first dimension. This two-dimensional array is then re-shaped to be compatible with the shape expected by the model object. Specifically, the first dimension is kept unchanged, but the second dimension is reshaped to be consistent with the shape of all dimensions in the model, following the first dimension.
+Default CSV deserialization requires that ``request_body`` contain one or more lines of CSV numerical data. The data is loaded into a two-dimensional array, where each line break defines the boundaries of the first dimension. This two-dimensional array is then reshaped to be compatible with the shape expected by the model object. Specifically, the first dimension is kept unchanged, but the second dimension is reshaped to be consistent with the shape of all dimensions in the model, following the first dimension.
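
To make the default behavior concrete, the sketch below approximates the JSON path described above; the payload is a hypothetical single-row list, and the real implementation's padding and splitting logic is omitted:

.. code:: python

    import json

    import mxnet as mx
    import numpy as np

    # Hypothetical request body: a single JSON list (a batch of one row).
    request_body = '[[0.1, 0.2, 0.3, 0.4]]'

    # Roughly what the default input_fn does: load the list and wrap it
    # in an NDArrayIter so it can be passed to the default predict_fn.
    data = np.array(json.loads(request_body))
    data_iter = mx.io.NDArrayIter(data)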

If you provide your own implementation of ``input_fn``, you should abide by the ``input_fn`` signature. If you want to use this with the default
-``predict_fn``, then you should return an NDArrayIter. The NDArrayIter should have a shape identical to the shape of the model being predicted on. The example below shows a custom ``input_fn`` for preparing pickled numpy arrays.
+``predict_fn``, then you should return an ``NDArrayIter``. The ``NDArrayIter`` should have a shape identical to the shape of the model being predicted on. The example below shows a custom ``input_fn`` for preparing pickled numpy arrays.

.. code:: python

    import numpy as np
    import mxnet as mx

-    def input_fn(request_body, request_content_type, model):
+    def input_fn(request_body, request_content_type):
        """An input_fn that loads a pickled numpy array"""
        if request_content_type == 'application/python-pickle':
            array = np.load(StringIO(request_body))
@@ -616,7 +629,7 @@ If you provide your own implementation of input_fn, you should abide by the ``in
            pass

Prediction
-''''''''''
+""""""""""

After the inference request has been deserialized by ``input_fn``, the SageMaker MXNet model server invokes ``predict_fn``. As with ``input_fn``, you can define your own ``predict_fn`` or use the SageMaker MXNet default.

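As a minimal sketch, assuming a Gluon model and a custom ``input_fn`` that returns an ``NDArray`` rather than the default ``NDArrayIter``, a custom ``predict_fn`` could be as simple as:

.. code:: python

    def predict_fn(input_object, model):
        """A minimal sketch: run a forward pass of a Gluon model.

        Assumes input_object is an mx.nd.NDArray produced by a matching
        custom input_fn, not the default NDArrayIter.
        """
        return model(input_object)
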
@@ -649,9 +662,9 @@ If you implement your own prediction function, you should take care to ensure th
``output_fn``, this should be an ``NDArrayIter``.

Output processing
-'''''''''''''''''
+"""""""""""""""""

-After invoking ``predict_fn``, the model server invokes ``output_fn``, passing in the return-value from ``predict_fn`` and the InvokeEndpoint requested response content-type.
+After invoking ``predict_fn``, the model server invokes ``output_fn``, passing in the return value from ``predict_fn`` and the InvokeEndpoint requested response content type.

The ``output_fn`` has the following signature:

@@ -660,10 +673,29 @@ The ``output_fn`` has the following signature:
    def output_fn(prediction, content_type)

Where ``prediction`` is the result of invoking ``predict_fn`` and
-``content_type`` is the InvokeEndpoint requested response content-type. The function should return a byte array of data serialized to content_type.
+``content_type`` is the InvokeEndpoint requested response content type. The function should return a byte array of data serialized to the expected content type.

The default implementation expects ``prediction`` to be an ``NDArray`` and can serialize the result to either JSON or CSV. It accepts response content types of "application/json" and "text/csv".

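As an illustration, a custom ``output_fn`` that only serves JSON might look like the sketch below; it assumes ``prediction`` arrives as an MXNet ``NDArray``:

.. code:: python

    import json

    def output_fn(prediction, content_type):
        """A sketch of a JSON-only output_fn; assumes prediction is an NDArray."""
        if content_type == 'application/json':
            return json.dumps(prediction.asnumpy().tolist())
        raise ValueError('Unsupported response content type: {}'.format(content_type))
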
+Using ``transform_fn``
+''''''''''''''''''''''

+If you would rather not structure your code around the three methods described above, you can also define your own ``transform_fn`` to handle inference requests. This function has the following signature:

+.. code:: python

+    def transform_fn(model, request_body, content_type, accept_type)

+Where ``model`` is the model object loaded by ``model_fn``, ``request_body`` is the data from the inference request, ``content_type`` is the content type of the request, and ``accept_type`` is the requested content type of the response.

+This single function should handle processing the input, performing a prediction, and processing the output.
+The return object should be one of the following:

+- a tuple with two items: the response data and ``accept_type`` (the content type of the response data), or
+- a Flask response object: http://flask.pocoo.org/docs/1.0/api/#response-objects

+You can find examples of hosting scripts using this structure in the example notebooks, such as the `mxnet_gluon_sentiment <https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/mxnet_gluon_sentiment/sentiment.py#L344-L387>`__ notebook.

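Putting those pieces together, a hypothetical JSON-in, JSON-out ``transform_fn`` for a Gluon model might look like the following sketch; the ``model(data)`` call and the JSON handling are assumptions about your particular model and payload:

.. code:: python

    import json

    import mxnet as mx

    def transform_fn(model, request_body, content_type, accept_type):
        """A sketch of a JSON-in, JSON-out transform_fn for a Gluon model."""
        if content_type != 'application/json':
            raise ValueError('Unsupported request content type: {}'.format(content_type))

        data = mx.nd.array(json.loads(request_body))
        prediction = model(data)

        # Return (response data, content type of the response data).
        return json.dumps(prediction.asnumpy().tolist()), accept_type
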
Working with existing model data and training jobs
--------------------------------------------------

src/sagemaker/mxnet/README.rst

Lines changed: 1 addition & 1 deletion
@@ -818,4 +818,4 @@ The Docker images extend Ubuntu 16.04.
You can select the version of MXNet by passing a ``framework_version`` keyword arg to the MXNet Estimator constructor. Currently supported versions are listed in the above table. You can also set ``framework_version`` to only specify the major and minor version, e.g. ``1.2``, which will cause your training script to be run on the latest supported patch version of that minor version, which in this example would be 1.2.1.
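
For instance, pinning the minor version might look like the sketch below; the entry point, role, and instance settings are placeholders for your own values:

.. code:: python

    from sagemaker.mxnet import MXNet

    estimator = MXNet(entry_point='train.py',           # placeholder script name
                      role='SageMakerRole',             # placeholder IAM role
                      train_instance_count=1,
                      train_instance_type='ml.m4.xlarge',
                      framework_version='1.2')          # runs on the latest 1.2.x image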
Alternatively, you can build your own image by following the instructions in the SageMaker MXNet containers repository, and passing ``image_name`` to the MXNet Estimator constructor.

-You can visit the SageMaker MXNet containers repository here: https://github.com/aws/sagemaker-mxnet-containers/
+You can visit the SageMaker MXNet containers repository here: https://github.com/aws/sagemaker-mxnet-container
