Commit 6677cb7

update docstring to be consistent with aws document
1 parent b754bf9 commit 6677cb7

1 file changed: +30 -21 lines changed


src/sagemaker/dataset_definition/inputs.py

Lines changed: 30 additions & 21 deletions
@@ -29,16 +29,18 @@ class RedshiftDatasetDefinition(ApiObject):

     Attributes:
         cluster_id (str): The Redshift cluster Identifier.
-        database (str): The Redshift database created for your cluster.
-        db_user (str): The user name of a user account that has permission to connect
-            to the database.
+        database (str): The name of the Redshift database used in Redshift query execution.
+        db_user (str): The database user name used in Redshift query execution.
         query_string (str): The SQL query statements to be executed.
-        cluster_role_arn (str): Redshift cluster role arn.
-        output_s3_uri (str): The path to a specific S3 object or a S3 prefix for output
-        kms_key_id (str): KMS key id.
-        output_format (str): the data storage format for Redshift query results.
+        cluster_role_arn (str): The IAM role attached to your Redshift cluster that
+            Amazon SageMaker uses to generate datasets.
+        output_s3_uri (str): The location in Amazon S3 where the Redshift query
+            results are stored.
+        kms_key_id (str): The AWS Key Management Service (AWS KMS) key that Amazon
+            SageMaker uses to encrypt data from a Redshift execution.
+        output_format (str): The data storage format for Redshift query results.
             Valid options are "PARQUET", "CSV"
-        output_compression (str): compression used for Redshift query results.
+        output_compression (str): The compression used for Redshift query results.
             Valid options are "None", "GZIP", "SNAPPY", "ZSTD", "BZIP2"
     """


@@ -59,15 +61,16 @@ class AthenaDatasetDefinition(ApiObject):
     With this input, SQL queries will be executed using Athena to generate datasets to S3.

     Attributes:
-        catalog (str): The name of the data catalog used in query execution.
-        database (str): The name of the database used in the query execution.
-        query_string (str): The SQL query statements to be executed.
-        output_s3_uri (str): the path to a specific S3 object or a S3 prefix for output
-        work_group (str): The name of the workgroup in which the query is being started.
-        kms_key_id (str): KMS key id.
-        output_format (str): the data storage format for Athena query results.
+        catalog (str): The name of the data catalog used in Athena query execution.
+        database (str): The name of the database used in the Athena query execution.
+        query_string (str): The SQL query statements, to be executed.
+        output_s3_uri (str): The location in Amazon S3 where Athena query results are stored.
+        work_group (str): The name of the workgroup in which the Athena query is being started.
+        kms_key_id (str): The AWS Key Management Service (AWS KMS) key that Amazon
+            SageMaker uses to encrypt data generated from an Athena query execution.
+        output_format (str): The data storage format for Athena query results.
             Valid options are "PARQUET", "ORC", "AVRO", "JSON", "TEXTFILE"
-        output_compression (str): compression used for Athena query results.
+        output_compression (str): The compression used for Athena query results.
             Valid options are "GZIP", "SNAPPY", "ZLIB"
     """

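The valid options listed in the Redshift and Athena docstrings above could be checked before a definition is built. The following is a minimal sketch, not part of this commit or of the SageMaker SDK: `VALID_OUTPUT_OPTIONS` and `validate_output_options` are hypothetical names, the option lists are copied from the docstring text in this diff, and treating compression as optional is an assumption.

```python
# Hypothetical helper; it is NOT part of sagemaker.dataset_definition.
# The option lists are copied verbatim from the docstrings in this diff.
VALID_OUTPUT_OPTIONS = {
    "redshift": {
        "output_format": {"PARQUET", "CSV"},
        "output_compression": {"None", "GZIP", "SNAPPY", "ZSTD", "BZIP2"},
    },
    "athena": {
        "output_format": {"PARQUET", "ORC", "AVRO", "JSON", "TEXTFILE"},
        "output_compression": {"GZIP", "SNAPPY", "ZLIB"},
    },
}


def validate_output_options(source, output_format, output_compression=None):
    """Raise ValueError if a value is not listed in the docstring for `source`."""
    if source not in VALID_OUTPUT_OPTIONS:
        raise ValueError(f"unknown source: {source!r}")
    allowed = VALID_OUTPUT_OPTIONS[source]
    if output_format not in allowed["output_format"]:
        raise ValueError(
            f"{source} output_format must be one of {sorted(allowed['output_format'])}"
        )
    # Assumption: compression may be omitted, so None skips the check.
    if (
        output_compression is not None
        and output_compression not in allowed["output_compression"]
    ):
        raise ValueError(
            f"{source} output_compression must be one of "
            f"{sorted(allowed['output_compression'])}"
        )


# Accepted: both values appear in the Redshift docstring.
validate_output_options("redshift", "CSV", "ZSTD")
```

Note that the two sources do not share option sets: "CSV" is listed only for Redshift, and "ZLIB" only for Athena, so a single shared list would accept invalid combinations.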

@@ -85,15 +88,21 @@ class DatasetDefinition(ApiObject):
     """DatasetDefinition input.

     Attributes:
-        data_distribution_type (str): Valid options are "FullyReplicated" or "ShardedByS3Key".
-        input_mode (str): Valid options are "Pipe" or "File".
-        local_path (str): the path to a local directory. If not provided, skips data download by
-            SageMaker platform.
+        data_distribution_type (str): Whether the generated dataset is FullyReplicated or
+            ShardedByS3Key (default).
+        input_mode (str): Whether to use File or Pipe input mode. In File (default) mode, Amazon
+            SageMaker copies the data from the input source onto the local Amazon Elastic Block
+            Store (Amazon EBS) volumes before starting your training algorithm. This is the most
+            commonly used input mode. In Pipe mode, Amazon SageMaker streams input data from the
+            source directly to your algorithm without using the EBS volume.
+        local_path (str): The local path where you want Amazon SageMaker to download the Dataset
+            Definition inputs to run a processing job. LocalPath is an absolute path to the input
+            data. This is a required parameter when `AppManaged` is False (default).
         redshift_dataset_definition
             (:class:`~sagemaker.dataset_definition.RedshiftDatasetDefinition`): Redshift
             dataset definition.
         athena_dataset_definition (:class:`~sagemaker.dataset_definition.AthenaDatasetDefinition`):
-            Athena dataset definition.
+            Configuration for Athena Dataset Definition input.
     """

     _custom_boto_types = {
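
The defaults and the `local_path` rule described in the updated `DatasetDefinition` docstring can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the SDK implementation: `DatasetDefinitionSketch` and its `app_managed` field are hypothetical, the defaults are taken from the "(default)" markers in the docstring text, and the example path is illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetDefinitionSketch:
    """Hypothetical stand-in for the documented fields; not the SDK class."""

    # Defaults follow the "(default)" markers in the docstring.
    data_distribution_type: str = "ShardedByS3Key"
    input_mode: str = "File"
    local_path: Optional[str] = None
    # Assumption: models the `AppManaged` flag that the docstring refers to.
    app_managed: bool = False

    def __post_init__(self):
        if self.data_distribution_type not in ("FullyReplicated", "ShardedByS3Key"):
            raise ValueError("data_distribution_type must be FullyReplicated or ShardedByS3Key")
        if self.input_mode not in ("File", "Pipe"):
            raise ValueError("input_mode must be File or Pipe")
        # Per the docstring, local_path is required when AppManaged is False
        # and must be an absolute path.
        if not self.app_managed:
            if self.local_path is None:
                raise ValueError("local_path is required when AppManaged is False")
            if not self.local_path.startswith("/"):
                raise ValueError("local_path must be an absolute path")


# Only local_path is supplied; the other fields fall back to the documented defaults.
definition = DatasetDefinitionSketch(local_path="/opt/ml/processing/input/dataset")
```

The check lives in `__post_init__` so that an invalid combination fails at construction time rather than when a job is submitted.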
