documentation: add description for parameters in TransformInput #3817

Merged
merged 41 commits into from
May 16, 2023
c130f27
Update description for TransformInput()
mwfongAWS Apr 26, 2023
b0cce51
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS Apr 27, 2023
9f740f5
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 1, 2023
e4f8b67
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 2, 2023
47b0b27
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 3, 2023
34ab33d
Fix URLs
mwfongAWS May 5, 2023
5d3a1be
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 5, 2023
8329884
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 8, 2023
fdd98e6
updated some text
mwfongAWS May 9, 2023
07bfb9b
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 9, 2023
3018cbd
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 10, 2023
56cfc8c
pylint errors
mwfongAWS May 10, 2023
537a9a3
Merge branch 'mwfongAWS-SM-sdk' of https://github.com/mwfongAWS/sagem…
mwfongAWS May 10, 2023
7c47a0c
sphinx error
mwfongAWS May 10, 2023
d3cdd3b
indent
mwfongAWS May 10, 2023
0c921e1
undo
mwfongAWS May 10, 2023
ff0a288
test
mwfongAWS May 11, 2023
8d9a82d
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 11, 2023
8f4257c
address suggested changes
mwfongAWS May 11, 2023
d510bca
Merge branch 'mwfongAWS-SM-sdk' of https://github.com/mwfongAWS/sagem…
mwfongAWS May 11, 2023
254f2f1
comment v2
mwfongAWS May 11, 2023
5512f71
comments v3
mwfongAWS May 11, 2023
e81113a
attempt fix
mwfongAWS May 12, 2023
96bdb19
attempt fix
mwfongAWS May 12, 2023
408877c
Update description for TransformInput()
mwfongAWS Apr 26, 2023
ee4fa37
Fix URLs
mwfongAWS May 5, 2023
f5376f8
updated some text
mwfongAWS May 9, 2023
09ad6ea
pylint errors
mwfongAWS May 10, 2023
ca0c740
sphinx error
mwfongAWS May 10, 2023
d4a27a0
indent
mwfongAWS May 10, 2023
a36fb23
undo
mwfongAWS May 10, 2023
ac9c484
test
mwfongAWS May 11, 2023
c14e9ed
address suggested changes
mwfongAWS May 11, 2023
74c5dbe
comment v2
mwfongAWS May 11, 2023
4210ea7
comments v3
mwfongAWS May 11, 2023
3057cba
attempt fix
mwfongAWS May 12, 2023
a267c3e
attempt fix
mwfongAWS May 12, 2023
e5c9202
attempt fix 132
mwfongAWS May 13, 2023
6e5bb4b
merge fix
mwfongAWS May 13, 2023
832d70d
attempt fix
mwfongAWS May 16, 2023
46478d6
Merge branch 'master' into mwfongAWS-SM-sdk
mwfongAWS May 16, 2023
61 changes: 60 additions & 1 deletion src/sagemaker/inputs.py
@@ -162,9 +162,68 @@ class CreateModelInput(object):

@attr.s
class TransformInput(object):
"""Create a class containing all the parameters.
"""Creates a class containing parameters for configuring input data for a batch transform job.

It can be used when calling ``sagemaker.transformer.Transformer.transform()``.

Args:
data (str): The S3 location of the input data that the model can consume.
data_type (str): The data type for a batch transform job.
(default: ``'S3Prefix'``)
content_type (str): The Multipurpose Internet Mail Extensions (MIME) type of the data.
(default: None)
compression_type (str): If your transform data is compressed, specify the compression type.
Valid values: ``'Gzip'``, ``None``
(default: None)
split_type (str): The method to use to split the transform job's data files into smaller
batches.
Valid values: ``'Line'``, ``'RecordIO'``, ``'TFRecord'``, ``None``
(default: None)
input_filter (str): A JSONPath expression for selecting a portion of the input data to pass
to the algorithm. For example, you can use this parameter to exclude fields, such as an
ID column, from the input. If you want SageMaker to pass the entire input dataset to the
algorithm, accept the default value ``$``. For more information on batch transform data
processing, input, join, and output, see
`Associate Prediction Results with Input Records
<https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html>`_
in the *Amazon SageMaker developer guide*.
Example value: ``$``. For more information about valid values for this parameter, see
`JSONPath Operators
<https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html#data-processing-operators>`_
in the *Amazon SageMaker developer guide*.
(default: ``$``)
output_filter (str): A JSONPath expression for selecting a portion of the joined dataset to
save in the output file for a batch transform job. If you want SageMaker to store the
entire input dataset in the output file, leave the default value, ``$``. If you specify
indexes that aren't within the dimension size of the joined dataset, you get an error.
Example value: ``$``. For more information about valid values for this parameter, see
`JSONPath Operators
<https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html#data-processing-operators>`_
in the *Amazon SageMaker developer guide*.
(default: ``$``)
join_source (str): Specifies the source of the data to join with the transformed data.
The default value is ``None``, which specifies not to join the input with the
transformed data. If you want the batch transform job to join the original input data
with the transformed data, set this parameter to ``'Input'``.
Valid values: ``None``, ``'Input'``
(default: None)
model_client_config (dict): Configures the timeout and maximum number of retries for
processing a transform job invocation.

* ``'InvocationsTimeoutInSeconds'`` (int) - The timeout value in seconds for an
invocation request. The default value is 600.
* ``'InvocationsMaxRetries'`` (int) - The maximum number of retries when invocation
requests are failing.

(default: ``{'InvocationsTimeoutInSeconds': 600,
'InvocationsMaxRetries': 3}``)
batch_data_capture_config (dict): A `BatchDataCaptureConfig
<https://sagemaker.readthedocs.io/en/stable/api/utility/inputs.html#sagemaker.inputs.BatchDataCaptureConfig>`_
object that specifies the data capture configuration of a batch transform job
for use with Amazon SageMaker Model Monitoring. For more information,
see `Capture data from batch transform job
<https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-capture-batch.html>`_
in the *Amazon SageMaker developer guide*.
(default: None)
"""

data: str = attr.ib()
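
Reviewer note: to make the parameter descriptions above easier to check, here is a minimal standard-library sketch of how these arguments line up with the TransformInput, DataProcessing, and ModelClientConfig fields of the CreateTransformJob API request. The helper name build_transform_request_fields, the bucket URI, and the filter values are hypothetical and chosen for illustration only; this is not the SDK's implementation, and the SDK may assemble its request differently.

```python
from typing import Any, Dict, Optional


def build_transform_request_fields(
    data: str,
    data_type: str = "S3Prefix",
    content_type: Optional[str] = None,
    compression_type: Optional[str] = None,
    split_type: Optional[str] = None,
    input_filter: Optional[str] = None,
    output_filter: Optional[str] = None,
    join_source: Optional[str] = None,
    model_client_config: Optional[Dict[str, int]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper (not part of the SageMaker SDK).

    Arranges TransformInput-style arguments into the field layout used by
    the CreateTransformJob API request. Arguments left as None are omitted.
    """
    transform_input: Dict[str, Any] = {
        "DataSource": {"S3DataSource": {"S3DataType": data_type, "S3Uri": data}}
    }
    optional_input = {
        "ContentType": content_type,
        "CompressionType": compression_type,
        "SplitType": split_type,
    }
    transform_input.update(
        {key: value for key, value in optional_input.items() if value is not None}
    )

    request: Dict[str, Any] = {"TransformInput": transform_input}

    # input_filter, output_filter and join_source travel together
    # in the DataProcessing field.
    data_processing = {
        "InputFilter": input_filter,
        "OutputFilter": output_filter,
        "JoinSource": join_source,
    }
    data_processing = {
        key: value for key, value in data_processing.items() if value is not None
    }
    if data_processing:
        request["DataProcessing"] = data_processing

    if model_client_config is not None:
        request["ModelClientConfig"] = dict(model_client_config)

    return request


# Illustrative values only: drop a leading ID column before inference,
# join predictions back onto the input, then keep the first and last fields.
fields = build_transform_request_fields(
    data="s3://example-bucket/batch-input/",  # placeholder URI
    content_type="text/csv",
    split_type="Line",
    input_filter="$[1:]",
    join_source="Input",
    output_filter="$[0,-1]",
    model_client_config={
        "InvocationsTimeoutInSeconds": 600,
        "InvocationsMaxRetries": 3,
    },
)
```

The sketch groups the three data-processing arguments together because the docstring describes them as one pipeline: filter the input, join it with the prediction, then filter the joined record.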