Fix broken test test_distributed_mnist_no_ps #156


Merged
merged 1 commit into from
Jan 28, 2019


icywang86rui
Contributor

This test shouldn't save checkpoints, since the two hosts are just running
training jobs independently and their checkpoints interfere with each other.
This change switches the test to the Keras MNIST script.

It also changes the saved model path to /opt/ml/opt so we can just use
the estimator.model_data path to assert that the model exists.
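
For illustration only, here is a minimal sketch of what "assert the model exists" could look like once a test has a path to the model artifact. It assumes the value of `estimator.model_data` resolves to a local tar archive; the helper names `model_artifact_exists` and `make_fake_model_archive` are hypothetical and are not taken from this repository's test code.

```python
import os
import tarfile
import tempfile


def model_artifact_exists(model_data_path):
    """Return True if the path is a readable tar archive with at least one member.

    Hypothetical helper: assumes model_data_path is a local file path.
    """
    if not os.path.isfile(model_data_path):
        return False
    if not tarfile.is_tarfile(model_data_path):
        return False
    with tarfile.open(model_data_path) as archive:
        return len(archive.getnames()) > 0


def make_fake_model_archive(directory):
    """Create a small stand-in model archive so the check can be tried locally."""
    model_file = os.path.join(directory, "saved_model.pb")
    with open(model_file, "wb") as f:
        f.write(b"placeholder model bytes")

    archive_path = os.path.join(directory, "model.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(model_file, arcname="saved_model.pb")
    return archive_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(model_artifact_exists(make_fake_model_archive(tmp)))
```

Checking the archive contents rather than only the file's presence guards against a job that wrote an empty or truncated artifact.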

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@icywang86rui icywang86rui requested a review from yangaws January 28, 2019 18:53
@icywang86rui icywang86rui merged commit ec07c35 into aws:script-mode Jan 28, 2019
Elizaaaaa pushed a commit to Elizaaaaa/sagemaker-tensorflow-container that referenced this pull request Nov 4, 2019