Updated: README with BlazingText and videogames to use prefix #174

Merged
merged 6 commits into from
Feb 2, 2018
3 changes: 2 additions & 1 deletion README.md
@@ -30,6 +30,7 @@ These examples provide quick walkthroughs to get you up and running with Amazon
- [XGBoost for regression](introduction_to_amazon_algorithms/xgboost_abalone) predicts the age of abalone ([Abalone dataset](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html)) using regression from Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost).
- [XGBoost for multi-class classification](introduction_to_amazon_algorithms/xgboost_mnist) uses Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost) to classify handwritten digits from the MNIST dataset as one of the ten digits using a multi-class classifier. Both single machine and distributed use-cases are presented.
- [DeepAR for time series forecasting](introduction_to_amazon_algorithms/deepar_synthetic) illustrates how to use the Amazon SageMaker DeepAR algorithm for time series forecasting on a synthetically generated data set.
+- [BlazingText Word2Vec](introduction_to_amazon_algorithms/blazingtext_word2vec_text8) generates Word2Vec embeddings from a cleaned text dump of Wikipedia articles using SageMaker's fast and scalable BlazingText implementation.

### Scientific Details of Algorithms

@@ -71,7 +72,7 @@ These examples focus on the Amazon SageMaker Python SDK which allows you to writ

These examples show how to use Amazon SageMaker for model training, hosting, and inference through Apache Spark using [SageMaker Spark](https://github.com/aws/sagemaker-spark). SageMaker Spark allows you to interleave Spark Pipeline stages with Pipeline stages that interact with Amazon SageMaker.

-- [MNIST with SageMaker Spark](sagemaker-spark/pyspark_mnist)
+- [MNIST with SageMaker PySpark](sagemaker-spark/pyspark_mnist)

### Under Development

3 changes: 2 additions & 1 deletion introduction_to_amazon_algorithms/README.md
@@ -14,4 +14,5 @@ These examples provide quick walkthroughs to get you up and running with Amazon
- [Image Classification](introduction_to_amazon_algorithms/imageclassification_caltech) includes full training and transfer learning examples of Amazon SageMaker's Image Classification algorithm. This uses a ResNet deep convolutional neural network to classify images from the caltech dataset.
- [XGBoost for regression](xgboost_abalone) predicts the age of abalone ([Abalone dataset](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html)) using regression from Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost).
- [XGBoost for multi-class classification](xgboost_mnist) uses Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost) to classify handwritten digits from the MNIST dataset as one of the ten digits using a multi-class classifier. Both single machine and distributed use-cases are presented.
-- [DeepAR for time series forecasting](deepar_synthetic) illustrates how to use the Amazon SageMaker DeepAR algorithm for time series forecasting on a synthetically generated data set.
+- [DeepAR for time series forecasting](deepar_synthetic) illustrates how to use the Amazon SageMaker DeepAR algorithm for time series forecasting on a synthetically generated data set.
+- [BlazingText Word2Vec](blazingtext_word2vec_text8) generates Word2Vec embeddings from a cleaned text dump of Wikipedia articles using SageMaker's fast and scalable BlazingText implementation.
@@ -93,7 +93,7 @@
"---\n",
"## Data\n",
"\n",
-"Before proceeding further, you'll need to sign in to Kaggle or create a Kaggle account if you don't have one. Then **upload the raw CSV data set from the above Kaggle link to the S3 bucket you specified above**. The raw_data_filename specified below is the name of the data file from Kaggle, but you should alter it if the name changes. Let's download the data from your S3 bucket to your notebook instance, where it will appear in the same directory as this notebook. Then we'll take an initial look at the data."
+"Before proceeding further, you'll need to sign in to Kaggle or create a Kaggle account if you don't have one. Then **upload the raw CSV data set from the above Kaggle link to the S3 bucket and prefix you specified above**. The raw_data_filename specified below is the name of the data file from Kaggle, but you should alter it if the name changes. Let's download the data from your S3 bucket to your notebook instance, where it will appear in the same directory as this notebook. Then we'll take an initial look at the data."
]
},
{
@@ -107,7 +107,7 @@
"raw_data_filename = 'Video_Games_Sales_as_at_22_Dec_2016.csv'\n",
"\n",
"s3 = boto3.resource('s3')\n",
-"s3.Bucket(bucket).download_file(raw_data_filename, 'raw_data.csv')\n",
+"s3.Bucket(bucket).download_file(prefix + '/' + raw_data_filename, 'raw_data.csv')\n",
"\n",
"data = pd.read_csv('./raw_data.csv')\n",
"pd.set_option('display.max_rows', 20) \n",
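
The key construction this hunk introduces can be sketched as follows. This is a minimal illustration, not the notebook's code: it assumes `bucket` and `prefix` are plain strings defined earlier in the notebook, and the helper names `build_s3_key` and `download_raw_data` are hypothetical. Stripping slashes from the prefix is an added safeguard that the notebook's `prefix + '/' + raw_data_filename` does not perform.

```python
def build_s3_key(prefix, filename):
    """Join an S3 prefix and a file name into an object key (hypothetical helper)."""
    # Strip surrounding slashes so 'data/' and 'data' yield the same key.
    cleaned = prefix.strip('/')
    # With an empty prefix the object sits at the bucket root.
    return f"{cleaned}/{filename}" if cleaned else filename


def download_raw_data(bucket, prefix, filename, local_path='raw_data.csv'):
    """Download s3://<bucket>/<prefix>/<filename> to local_path."""
    # Imported lazily so build_s3_key can be used without boto3 installed.
    import boto3

    s3 = boto3.resource('s3')
    s3.Bucket(bucket).download_file(build_s3_key(prefix, filename), local_path)
    return local_path
```

The object must be uploaded under the same prefix that the download uses; otherwise `download_file` fails because no object exists at that key.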
8 changes: 3 additions & 5 deletions sagemaker-spark/README.md
@@ -1,9 +1,7 @@
# Amazon SageMaker Examples

-## Amazon SageMaker using SageMaker Spark
+### Using Amazon SageMaker with Apache Spark

-These examples demonstrate using SageMaker Spark, which allows you to interleave Apache Spark code with Amazon SageMaker.
+These examples show how to use Amazon SageMaker for model training, hosting, and inference through Apache Spark using [SageMaker Spark](https://github.com/aws/sagemaker-spark). SageMaker Spark allows you to interleave Spark Pipeline stages with Pipeline stages that interact with Amazon SageMaker.

-See [SageMaker Spark](https://github.com/aws/sagemaker-spark) for more information.
-
-- [MNIST with SageMaker Spark](pyspark_mnist)
+- [MNIST with SageMaker PySpark](pyspark_mnist)