
Commit 5542dd3

Updated information on data preprocessing
1 parent aba170e commit 5542dd3

1 file changed (+11, -8 lines)

doc/workflows/kubernetes/using_amazon_sagemaker_components.rst

Lines changed: 11 additions & 8 deletions
@@ -463,21 +463,24 @@ you can create your classification pipeline. To create your pipeline,
 you need to define and compile it. You then deploy it and use it to run
 workflows. You can define your pipeline in Python and use the KFP
 dashboard, KFP CLI, or Python SDK to compile, deploy, and run your
-workflows.
+workflows. The full code for the MNIST classification pipeline example is available in the
+`Kubeflow Github
+repository <https://github.com/kubeflow/pipelines/blob/master/samples/contrib/aws-samples/mnist-kmeans-sagemaker>`__.
+To use it, clone the example Python files to your gateway node.

 Prepare datasets
 ~~~~~~~~~~~~~~~~

-To run the pipelines, you need to have the datasets in an S3 bucket in
-your account. This bucket must be located in the region where you want
-to run Amazon SageMaker jobs. If you don’t have a bucket, create one
+To run the pipelines, you need to upload the data extraction pre-processing script to an S3 bucket. This bucket and all resources for this example must be located in the ``us-east-1`` Amazon Region. If you don’t have a bucket, create one
 using the steps in `Creating a
 bucket <https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html>`__.

-From your gateway node, run the `sample dataset
-creation <https://github.com/kubeflow/pipelines/tree/34615cb19edfacf9f4d9f2417e9254d52dd53474/samples/contrib/aws-samples/mnist-kmeans-sagemaker#the-sample-dataset>`__
-script to copy the datasets into your bucket. Change the bucket name in
-the script to the one you created.
+From the ``mnist-kmeans-sagemaker`` folder of the Kubeflow repository you cloned on your gateway node, run the following command to upload the ``kmeans_preprocessing.py`` file to your S3 bucket. Change ``<bucket-name>`` to the name of the S3 bucket you created.
+
+::
+
+   aws s3 cp mnist-kmeans-sagemaker/kmeans_preprocessing.py s3://<bucket-name>/mnist_kmeans_example/processing_code/kmeans_preprocessing.py
+

 Create a Kubeflow Pipeline using Amazon SageMaker Components
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
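The added text above assumes the Kubeflow Pipelines repository has already been cloned to the gateway node. A minimal sketch of that step, assuming the upstream repository at ``github.com/kubeflow/pipelines`` and a home-directory checkout (both are assumptions, not part of this commit):

::

   # Clone the Kubeflow Pipelines repository onto the gateway node; the sample
   # lives under samples/contrib/aws-samples/mnist-kmeans-sagemaker.
   git clone https://github.com/kubeflow/pipelines.git ~/pipelines
   cd ~/pipelines/samples/contrib/aws-samples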

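For the bucket-creation step, the linked guide walks through the S3 console; an equivalent AWS CLI sketch, keeping the ``<bucket-name>`` placeholder from the commit and the ``us-east-1`` Region it requires:

::

   # Create the example bucket in us-east-1; replace <bucket-name> with a
   # globally unique name before running.
   aws s3 mb s3://<bucket-name> --region us-east-1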