This repository was archived by the owner on May 25, 2022. It is now read-only.
* Add data_type to hyperparameters (aws#54)
When we describe a training job, the data type of the hyperparameters is
lost because we use a dict[str, str]. This adds a new field to
Hyperparameter so that we can convert the data types at runtime.
Instead of validating with isinstance(), we cast the hyperparameter value
to the type it is meant to be. This enforces a "strongly typed" value and
makes it easier to deal with the string responses we deserialize from the
API.
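The casting behavior described above can be sketched with a Python descriptor. This is an illustrative sketch, not the SDK's actual implementation; the class and attribute names below are assumptions.

```python
class Hyperparameter:
    """Descriptor that stores a data type and casts values on assignment."""

    def __init__(self, name, data_type):
        self.name = name
        self.data_type = data_type  # e.g. int, float, str

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        # Cast rather than isinstance-check: this enforces a "strongly
        # typed" value and handles the str values the describe API returns.
        obj.__dict__[self.name] = self.data_type(value)


class KMeansHyperparameters:
    # Hypothetical hyperparameter set, for illustration only.
    k = Hyperparameter("k", int)
    tol = Hyperparameter("tol", float)


hp = KMeansHyperparameters()
hp.k = "10"    # strings, as deserialized from the API response
hp.tol = "0.5"
```

Assigning the API's string values through the descriptor yields properly typed Python values.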
* Add wrapper for LDA. (aws#56)
Update CHANGELOG and bump the version number.
* Add support for async fit() (aws#59)
When fit(wait=False) is called, it returns immediately. The training
job carries on even if the process exits. Using attach(), the
estimator can be retrieved by providing the training job name.
_prepare_init_params_from_job_description() is now a classmethod instead
of a static method. Each class is responsible for implementing its own
logic to convert a training job description into arguments that
can be passed to its __init__().
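The classmethod change matters because each subclass can extend the base argument mapping while still being constructed generically. A minimal sketch of the pattern, under the assumption of simplified class names and a job description passed in directly (the real attach() takes a job name and calls the describe API):

```python
class EstimatorBase:
    def __init__(self, role, train_instance_type, **kwargs):
        self.role = role
        self.train_instance_type = train_instance_type

    @classmethod
    def _prepare_init_params_from_job_description(cls, job_details):
        # Arguments common to every estimator.
        return {
            "role": job_details["RoleArn"],
            "train_instance_type": job_details["ResourceConfig"]["InstanceType"],
        }

    @classmethod
    def attach(cls, job_details):
        # cls is the concrete subclass, so its override below is used.
        return cls(**cls._prepare_init_params_from_job_description(job_details))


class MXNetEstimator(EstimatorBase):
    def __init__(self, py_version="py2", **kwargs):
        super().__init__(**kwargs)
        self.py_version = py_version

    @classmethod
    def _prepare_init_params_from_job_description(cls, job_details):
        # The subclass adds its framework-specific arguments on top.
        params = super()._prepare_init_params_from_job_description(job_details)
        params["py_version"] = job_details["HyperParameters"]["sagemaker_py_version"]
        return params
```

With a static method, ``attach`` on the base class could not dispatch to the subclass's conversion logic; a classmethod makes ``cls`` the concrete estimator type.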
* Fix Estimator role expansion (aws#68)
Instead of manually constructing the role ARN, use the IAM boto client
to do it. This properly expands service-roles and regular roles.
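The fix can be sketched as follows. ``expand_role`` is a hypothetical helper name, and the stub class stands in for ``boto3.client("iam")``, whose real ``get_role`` call returns a response of the same shape:

```python
def expand_role(role_name, iam_client):
    """Resolve a role name to its full ARN via the IAM GetRole API.

    Asking IAM to resolve the name correctly expands service-roles and
    path-qualified roles, which manually constructing the string
    "arn:aws:iam::<account>:role/<name>" gets wrong.
    """
    return iam_client.get_role(RoleName=role_name)["Role"]["Arn"]


class StubIamClient:
    """Stand-in for a boto3 IAM client, returning a canned GetRole response."""

    def get_role(self, RoleName):
        return {"Role": {"Arn": "arn:aws:iam::123456789012:role/service-role/" + RoleName}}
```

Note how the returned ARN contains the ``service-role/`` path segment that naive concatenation would omit.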
* Add FM and LDA to the documentation. (aws#66)
* Fix description of an argument of sagemaker.session.train (aws#69)
* Fix description of an argument of sagemaker.session.train
'input_config' should be an array which has channel objects.
* Add a link to the botocore docs
* Use 'list' instead of 'array' in the description
* Add ntm algorithm with doc, unit tests, integ tests (aws#73)
* JSON serializer: predictor.predict accepts dictionaries (aws#62)
Add support for serializing python dictionaries to json
Add prediction with dictionary in tf iris integ test
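The serializer change means a plain ``dict`` can be passed straight to ``predict``. A minimal sketch of the serialization step (the function name here is illustrative, not the SDK's):

```python
import json

def serialize_input(data):
    """Serialize a payload for a JSON content-type request.

    After this change, plain Python dictionaries (e.g. a feature dict
    for the TensorFlow iris model) are accepted as well as lists.
    """
    return json.dumps(data)
```

For example, ``serialize_input({"sepal_length": 5.1})`` produces a JSON object body rather than raising on the non-list input.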
* Fixing timeouts for PCA async integration test. (aws#78)
Execute the tf_cifar test without logs to eliminate the delay in detecting that the job has finished.
* Fixes in LinearLearner and unit tests addition. (aws#77)
* Print out billable seconds after training completes (aws#30)
* Added: print out billable seconds after training completes
* Fixed: test_session.py to pass unit tests
* Fixed: removed offending tzlocal()
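A sketch of how billable seconds can be derived from a DescribeTrainingJob-style response; the helper name and the exact formula the SDK prints are assumptions here, and timezone-aware datetimes stand in for the tzlocal() usage that was removed:

```python
from datetime import datetime, timezone

def billable_seconds(description):
    """Compute billable seconds from a DescribeTrainingJob-style response.

    TrainingStartTime and TrainingEndTime are datetimes in the real API
    response; the instance count multiplies the wall-clock duration.
    """
    duration = description["TrainingEndTime"] - description["TrainingStartTime"]
    instances = description["ResourceConfig"]["InstanceCount"]
    return int(duration.total_seconds()) * instances
```

A five-minute job on two instances would therefore be billed as 600 seconds.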
* Use sagemaker_timestamp when creating endpoint names in integration tests. (aws#81)
* Support TensorFlow-1.5.0 and MXNet-1.0.0 (aws#82)
* Update .gitignore to ignore pytest_cache.
* Support TensorFlow-1.5.0 and MXNet-1.0.0
* Update and refactor tests. Add tests for fw_utils.
* Fix typo.
* Update changelog for 1.1.0 (aws#85)
@@ -97,7 +97,7 @@ SageMaker Python SDK provides several high-level abstractions for working with A
 - **Estimators**: Encapsulate training on SageMaker. Can be ``fit()`` to run training, then the resulting model ``deploy()`` ed to a SageMaker Endpoint.
 - **Models**: Encapsulate built ML models. Can be ``deploy()`` ed to a SageMaker Endpoint.
 - **Predictors**: Provide real-time inference and transformation using Python data-types against a SageMaker Endpoint.
-- **Session**: Provides a collection of convience methods for working with SageMaker resources.
+- **Session**: Provides a collection of convenience methods for working with SageMaker resources.

 Estimator and Model implementations for MXNet, TensorFlow, and Amazon ML algorithms are included. There's also an Estimator that runs SageMaker compatible custom Docker containers, allowing you to run your own ML algorithms via SageMaker Python SDK.
@@ -114,6 +114,8 @@ MXNet SageMaker Estimators

 With MXNet Estimators, you can train and host MXNet models on Amazon SageMaker.

+Supported versions of MXNet: ``1.0.0``, ``0.12.1``.
+
 Training with MXNet
 ~~~~~~~~~~~~~~~~~~~
@@ -185,7 +187,7 @@ If you want to run your training script locally via the Python interpreter, look
 Using MXNet and numpy
 ^^^^^^^^^^^^^^^^^^^^^

-You can import both ``mxnet`` and ``numpy`` in your training script. When your script runs in SageMaker, it will run with access to MXNet version 0.12 and numpy version 1.12.0. For more information on the environment your script runs in, please see `SageMaker MXNet Containers <#sagemaker-mxnet-containers>`__.
+You can import both ``mxnet`` and ``numpy`` in your training script. When your script runs in SageMaker, it will run with access to MXNet version 1.0.0 and numpy version 1.13.3 by default. For more information on the environment your script runs in, please see `SageMaker MXNet Containers <#sagemaker-mxnet-containers>`__.

 Running an MXNet training script in SageMaker
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -682,26 +684,33 @@ When training and deploying training scripts, SageMaker runs your Python script

 SageMaker runs MXNet Estimator scripts in either Python 2.7 or Python 3.5. You can select the Python version by passing a ``py_version`` keyword arg to the MXNet Estimator constructor. Setting this to ``py2`` (the default) will cause your training script to be run on Python 2.7. Setting this to ``py3`` will cause your training script to be run on Python 3.5. This Python version applies to both the Training Job, created by fit, and the Endpoint, created by deploy.

-Your MXNet training script will be run on version 0.12 of MXNet, built for either GPU or CPU use. The decision to use the GPU or CPU version of MXNet is made by the train_instance_type, set on the MXNet constructor. If you choose a GPU instance type, your training job will be run on a GPU version of MXNet. If you choose a CPU instance type, your training job will be run on a CPU version of MXNet. Similarly, when you call deploy, specifying a GPU or CPU deploy_instance_type, will control which MXNet build your Endpoint runs.
+Your MXNet training script will be run on version 1.0.0 (by default) or 0.12 of MXNet, built for either GPU or CPU use. The decision to use the GPU or CPU version of MXNet is made by the ``train_instance_type``, set on the MXNet constructor. If you choose a GPU instance type, your training job will be run on a GPU version of MXNet. If you choose a CPU instance type, your training job will be run on a CPU version of MXNet. Similarly, when you call deploy, specifying a GPU or CPU ``deploy_instance_type`` will control which MXNet build your Endpoint runs.

-Each Docker container has the following dependencies installed:
+The Docker images have the following dependencies installed:

-- Python 2.7 or Python 3.5, depending on the ``py_version`` argument on
-  the MXNet constructor.
-- MXNet 0.12, built for either GPU or CPU, depending on the instance
+You can select the version of MXNet by passing a ``framework_version`` keyword arg to the MXNet Estimator constructor. Currently supported versions are ``1.0.0`` and ``0.12.1``. You can also set ``framework_version`` to ``1.0`` (default) or ``0.12``, which will cause your training script to be run on the latest supported MXNet 1.0 or 0.12 versions respectively.

 TensorFlow SageMaker Estimators
 -------------------------------

 TensorFlow SageMaker Estimators allow you to run your own TensorFlow
 training algorithms on SageMaker Learner, and to host your own TensorFlow
 models on SageMaker Hosting.

+Supported versions of TensorFlow: ``1.4.1``, ``1.5.0``.
+
 Training with TensorFlow
 ~~~~~~~~~~~~~~~~~~~~~~~~
@@ -735,7 +744,7 @@ Preparing the TensorFlow training script
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Your TensorFlow training script must be a **Python 2.7** source file. The current supported TensorFlow
-version is **1.4.0**. This training script **must contain** the following functions:
+versions are **1.5.0 (default)** and **1.4.1**. This training script **must contain** the following functions:

 - ``model_fn``: defines the model that will be trained.
 - ``train_input_fn``: preprocess and load training data.
@@ -1150,6 +1159,7 @@ Optional arguments

 - ``wait (bool)``: Defaults to True, whether to block and wait for the
   training script to complete before returning.
+  If set to False, it will return immediately, and can later be attached to.
 - ``logs (bool)``: Defaults to True, whether to show logs produced by training
   job in the Python session. Only meaningful when wait is True.
 - ``run_tensorboard_locally (bool)``: Defaults to False. Executes TensorBoard in a different
@@ -1178,9 +1188,25 @@ the ``TensorFlow`` estimator parameter ``training_steps`` is finished or when the
 job execution time reaches the ``TensorFlow`` estimator parameter ``train_max_run``.

 When the training job finishes, a `TensorFlow serving <https://www.tensorflow.org/serving/serving_basic>`_
-with the result of the training is generated and saved to the S3 location define by
+with the result of the training is generated and saved to the S3 location defined by
 the ``TensorFlow`` estimator parameter ``output_path``.

+If the ``wait=False`` flag is passed to ``fit``, then it will return immediately. The training job will continue running
+asynchronously. At a later time, a TensorFlow Estimator can be obtained by attaching to the existing training job. If
+the training job is not finished, it will start showing the standard output of training and wait until it completes.
+After attaching, the estimator can be deployed as usual.
+You can select the version of TensorFlow by passing a ``framework_version`` keyword arg to the TensorFlow Estimator constructor. Currently supported versions are ``1.5.0`` and ``1.4.1``. You can also set ``framework_version`` to ``1.5`` (default) or ``1.4``, which will cause your training script to be run on the latest supported TensorFlow 1.5 or 1.4 versions respectively.

 AWS SageMaker Estimators
 ------------------------

 Amazon SageMaker provides several built-in machine learning algorithms that you can use for a variety of problem types.

 The full list of algorithms is available on the AWS website: https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html

-SageMaker Python SDK includes Estimator wrappers for the AWS K-means, Principal Components Analysis, and Linear Learner algorithms.
+SageMaker Python SDK includes Estimator wrappers for the AWS K-means, Principal Components Analysis (PCA), Linear Learner, Factorization Machines, Latent Dirichlet Allocation (LDA) and Neural Topic Model (NTM) algorithms.

 Definition and usage
 ~~~~~~~~~~~~~~~~~~~~

-Estimators that wrap Amazon's built-in algorithms define algorithm's hyperparameters with defaults. When a default is not possible you need to provide the value during construction:
+Estimators that wrap Amazon's built-in algorithms define algorithm hyperparameters with defaults. When a default is not possible you need to provide the value during construction, e.g.:

 - ``KMeans`` Estimator requires parameter ``k`` to define number of clusters
 - ``PCA`` Estimator requires parameter ``num_components`` to define number of principal components