@@ -29,16 +29,18 @@ class RedshiftDatasetDefinition(ApiObject):
Attributes:
cluster_id (str): The Redshift cluster Identifier.
- database (str): The Redshift database created for your cluster.
- db_user (str): The user name of a user account that has permission to connect
- to the database.
+ database (str): The name of the Redshift database used in Redshift query execution.
+ db_user (str): The database user name used in Redshift query execution.
query_string (str): The SQL query statements to be executed.
- cluster_role_arn (str): Redshift cluster role arn.
- output_s3_uri (str): The path to a specific S3 object or a S3 prefix for output
- kms_key_id (str): KMS key id.
- output_format (str): the data storage format for Redshift query results.
+ cluster_role_arn (str): The IAM role attached to your Redshift cluster that
+ Amazon SageMaker uses to generate datasets.
+ output_s3_uri (str): The location in Amazon S3 where the Redshift query
+ results are stored.
+ kms_key_id (str): The AWS Key Management Service (AWS KMS) key that Amazon
+ SageMaker uses to encrypt data from a Redshift execution.
+ output_format (str): The data storage format for Redshift query results.
Valid options are "PARQUET", "CSV"
- output_compression (str): compression used for Redshift query results.
+ output_compression (str): The compression used for Redshift query results.
Valid options are "None", "GZIP", "SNAPPY", "ZSTD", "BZIP2"
"""
@@ -59,15 +61,16 @@ class AthenaDatasetDefinition(ApiObject):
With this input, SQL queries will be executed using Athena to generate datasets to S3.

Attributes:
- catalog (str): The name of the data catalog used in query execution.
- database (str): The name of the database used in the query execution.
- query_string (str): The SQL query statements to be executed.
- output_s3_uri (str): the path to a specific S3 object or a S3 prefix for output
- work_group (str): The name of the workgroup in which the query is being started.
- kms_key_id (str): KMS key id.
- output_format (str): the data storage format for Athena query results.
+ catalog (str): The name of the data catalog used in Athena query execution.
+ database (str): The name of the database used in the Athena query execution.
+ query_string (str): The SQL query statements to be executed.
+ output_s3_uri (str): The location in Amazon S3 where Athena query results are stored.
+ work_group (str): The name of the workgroup in which the Athena query is being started.
+ kms_key_id (str): The AWS Key Management Service (AWS KMS) key that Amazon
+ SageMaker uses to encrypt data generated from an Athena query execution.
+ output_format (str): The data storage format for Athena query results.
Valid options are "PARQUET", "ORC", "AVRO", "JSON", "TEXTFILE"
- output_compression (str): compression used for Athena query results.
+ output_compression (str): The compression used for Athena query results.
Valid options are "GZIP", "SNAPPY", "ZLIB"
"""
@@ -85,15 +88,21 @@ class DatasetDefinition(ApiObject):
"""DatasetDefinition input.

Attributes:
- data_distribution_type (str): Valid options are "FullyReplicated" or "ShardedByS3Key".
- input_mode (str): Valid options are "Pipe" or "File".
- local_path (str): the path to a local directory. If not provided, skips data download by
- SageMaker platform.
+ data_distribution_type (str): Whether the generated dataset is FullyReplicated or
+ ShardedByS3Key (default).
+ input_mode (str): Whether to use File or Pipe input mode. In File (default) mode, Amazon
+ SageMaker copies the data from the input source onto the local Amazon Elastic Block
+ Store (Amazon EBS) volumes before starting your training algorithm. This is the most
+ commonly used input mode. In Pipe mode, Amazon SageMaker streams input data from the
+ source directly to your algorithm without using the EBS volume.
+ local_path (str): The local path where you want Amazon SageMaker to download the Dataset
+ Definition inputs to run a processing job. LocalPath is an absolute path to the input
+ data. This is a required parameter when `AppManaged` is False (default).
redshift_dataset_definition
(:class:`~sagemaker.dataset_definition.RedshiftDatasetDefinition`): Redshift
dataset definition.
athena_dataset_definition (:class:`~sagemaker.dataset_definition.AthenaDatasetDefinition`):
- Athena dataset definition .
+ Configuration for Athena Dataset Definition input.

_custom_boto_types = {
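To make the revised docstrings concrete, here is a minimal, self-contained sketch of how the documented Athena attributes and the `DatasetDefinition` defaults could fit together. It does not import or reproduce the SDK classes: `AthenaDatasetSketch` and `build_dataset_config` are hypothetical names, and the choice of which fields are optional is an assumption. Only the attribute names, the valid option lists, and the defaults (`File`, `ShardedByS3Key`) come from the docstrings in this diff.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Valid values copied from the docstrings in the diff.
ATHENA_OUTPUT_FORMATS = {"PARQUET", "ORC", "AVRO", "JSON", "TEXTFILE"}
ATHENA_OUTPUT_COMPRESSIONS = {"GZIP", "SNAPPY", "ZLIB"}


@dataclass
class AthenaDatasetSketch:
    """Hypothetical stand-in mirroring the documented Athena attributes."""

    catalog: str
    database: str
    query_string: str
    output_s3_uri: str
    output_format: str
    # Treating these three as optional is an assumption of this sketch.
    work_group: Optional[str] = None
    kms_key_id: Optional[str] = None
    output_compression: Optional[str] = None

    def __post_init__(self):
        # Reject values outside the option lists given in the docstring.
        if self.output_format not in ATHENA_OUTPUT_FORMATS:
            raise ValueError(f"unsupported output_format: {self.output_format!r}")
        if (
            self.output_compression is not None
            and self.output_compression not in ATHENA_OUTPUT_COMPRESSIONS
        ):
            raise ValueError(
                f"unsupported output_compression: {self.output_compression!r}"
            )


def build_dataset_config(
    athena: AthenaDatasetSketch,
    local_path: str,
    input_mode: str = "File",  # documented default
    data_distribution_type: str = "ShardedByS3Key",  # documented default
) -> dict:
    """Assemble a plain dict keyed by the documented attribute names."""
    if input_mode not in {"File", "Pipe"}:
        raise ValueError(f"unsupported input_mode: {input_mode!r}")
    if data_distribution_type not in {"FullyReplicated", "ShardedByS3Key"}:
        raise ValueError(
            f"unsupported data_distribution_type: {data_distribution_type!r}"
        )
    # Drop attributes that were left unset so the dict stays minimal.
    athena_fields = {k: v for k, v in asdict(athena).items() if v is not None}
    return {
        "data_distribution_type": data_distribution_type,
        "input_mode": input_mode,
        "local_path": local_path,
        "athena_dataset_definition": athena_fields,
    }
```

A caller would construct `AthenaDatasetSketch` with a catalog, database, query, S3 output location, and format, then pass it to `build_dataset_config` together with an absolute `local_path`. The bucket and path values used for that are placeholders, not values from the source.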