infra: generate test job name at test start instead of module start #1345
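
One way to read the title: a name built in a module-level constant is computed once, at import time, so every use of it in that module shares the same string, while a name built inside the test is computed fresh on each call. The sketch below illustrates that difference under stated assumptions; `name_at_test_start` is a hypothetical helper written for this note (it is not the SDK's `unique_name_from_base`), and the random suffix is an assumption added here to avoid same-second collisions.

```python
import time
import uuid

# Evaluated once, when the module is imported: every reference to this
# constant within the process sees the same string.
MODULE_LEVEL_NAME = "auto-ml-{}".format(time.strftime("%y%m%d-%H%M%S"))


def name_at_test_start(base="auto-ml"):
    # Hypothetical helper: evaluated on each call, so each caller gets its own name.
    # The uuid suffix is an assumption, added to avoid same-second collisions.
    return "{}-{}-{}".format(base, time.strftime("%y%m%d-%H%M%S"), uuid.uuid4().hex[:8])


def names_seen_by_two_tests():
    # Two "tests" reading the module constant share one name;
    # two calls to the helper produce two distinct names.
    module_names = {MODULE_LEVEL_NAME, MODULE_LEVEL_NAME}
    per_test_names = {name_at_test_start(), name_at_test_start()}
    return len(module_names), len(per_test_names)
```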

Merged
merged 4 commits into from
Mar 10, 2020
9 changes: 4 additions & 5 deletions tests/integ/test_auto_ml.py
@@ -13,7 +13,6 @@
 from __future__ import absolute_import
 
 import os
-import time
 
 import pytest
 import tests.integ
@@ -34,7 +33,7 @@
 TRAINING_DATA = os.path.join(DATA_DIR, "iris_training.csv")
 TEST_DATA = os.path.join(DATA_DIR, "iris_test.csv")
 PROBLEM_TYPE = "MultiClassClassification"
-JOB_NAME = "auto-ml-{}".format(time.strftime("%y%m%d-%H%M%S"))
+BASE_JOB_NAME = "auto-ml"
 
 # use a succeeded AutoML job to test describe and list candidates method, otherwise tests will run too long
 AUTO_ML_JOB_NAME = "python-sdk-integ-test-base-job"
@@ -119,11 +118,11 @@ def test_auto_ml_fit_optional_args(sagemaker_session):
     )
     inputs = TRAINING_DATA
     with timeout(minutes=AUTO_ML_DEFAULT_TIMEMOUT_MINUTES):
-        auto_ml.fit(inputs, job_name=JOB_NAME)
+        auto_ml.fit(inputs, job_name=unique_name_from_base(BASE_JOB_NAME))
 
-    auto_ml_desc = auto_ml.describe_auto_ml_job(job_name=JOB_NAME)
+    auto_ml_desc = auto_ml.describe_auto_ml_job(job_name=auto_ml.latest_auto_ml_job.job_name)
     assert auto_ml_desc["AutoMLJobStatus"] == "Completed"
-    assert auto_ml_desc["AutoMLJobName"] == JOB_NAME
+    assert auto_ml_desc["AutoMLJobName"] == auto_ml.latest_auto_ml_job.job_name
Contributor

nit: I feel like it would make more sense to save job_name separately before calling fit, and then assert that the job name we get later equals it. I don't feel strongly about it, though, because I can't clearly articulate why that's my instinct here.
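
For illustration, the suggestion above might look roughly like the sketch below. It is not the PR's code: `FakeAutoML` is a hypothetical stub standing in for the SDK class so the snippet runs without AWS access, and `make_unique_name` is a hypothetical stand-in for `unique_name_from_base`.

```python
import time
import uuid


def make_unique_name(base):
    # Hypothetical stand-in for unique_name_from_base: base + timestamp + random suffix.
    return "{}-{}-{}".format(base, int(time.time()), uuid.uuid4().hex[:6])


class FakeAutoML:
    # Hypothetical stub, not the SDK class; it only records the name it was given.
    def __init__(self):
        self.latest_job_name = None

    def fit(self, inputs, job_name):
        self.latest_job_name = job_name

    def describe_auto_ml_job(self, job_name):
        return {"AutoMLJobName": job_name, "AutoMLJobStatus": "Completed"}


def check_job_name_saved_before_fit():
    auto_ml = FakeAutoML()
    job_name = make_unique_name("auto-ml")  # saved before calling fit()
    auto_ml.fit("iris_training.csv", job_name=job_name)
    desc = auto_ml.describe_auto_ml_job(job_name=auto_ml.latest_job_name)
    # The expectation is the name we chose, not whatever the object reports.
    assert desc["AutoMLJobName"] == job_name
    return job_name, desc
```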

Contributor Author

I thought of that and chose this version purposefully.
If .fit() ever mutates the job name for whatever reason, the source of truth would be auto_ml.latest_auto_ml_job.job_name. Taking the reference from the object keeps it closer to the source.
I think this is more reliable for the future, even though realistically, there's no difference.
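
To make the "if .fit() ever mutates the job name" scenario concrete, here is a minimal sketch with a hypothetical stub whose `fit()` truncates the requested name. The truncation rule and the `TruncatingAutoML` class are assumptions invented for this illustration; nothing here describes how the SDK behaves. It compares an assertion against the object's recorded name with an assertion against the originally requested name.

```python
class TruncatingAutoML:
    # Hypothetical stub: fit() shortens the requested name to MAX_LEN characters.
    MAX_LEN = 16

    def __init__(self):
        self.latest_job_name = None

    def fit(self, inputs, job_name):
        self.latest_job_name = job_name[: self.MAX_LEN]

    def describe_auto_ml_job(self, job_name):
        return {"AutoMLJobName": job_name}


def compare_assert_styles(requested_name):
    auto_ml = TruncatingAutoML()
    auto_ml.fit("iris_training.csv", job_name=requested_name)
    desc = auto_ml.describe_auto_ml_job(job_name=auto_ml.latest_job_name)
    # Comparison against the name recorded on the object.
    matches_object = desc["AutoMLJobName"] == auto_ml.latest_job_name
    # Comparison against the name the test originally asked for.
    matches_requested = desc["AutoMLJobName"] == requested_name
    return matches_object, matches_requested
```

With a name longer than the stub's limit, the object-based comparison still holds while the comparison against the requested name does not, which is the behavioral difference between the two assertion styles.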

Contributor

I think my argument would be that if we were to change the logic around how fit() handles a specified job name, we would want our tests that are expecting otherwise to break. Otherwise we're just verifying the job name without a strong expectation of what the job name should be.

To be fair, though, I also don't think the integ tests are the right place for this kind of verification now that I think about it - it's much more of a "business logic to be tested by unit tests" kind of thing 😂
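
A unit test of the kind described here could be sketched as follows. `TinyAutoML` and its `session.create_job` call are hypothetical names made up for this example and do not reflect the SDK's internals; only `unittest.mock.Mock` and `assert_called_once_with` are real standard-library APIs.

```python
from unittest.mock import Mock


class TinyAutoML:
    # Hypothetical class for illustration; not the SDK's AutoML.
    def __init__(self, session):
        self.session = session
        self.latest_job_name = None

    def fit(self, inputs, job_name):
        # Hypothetical session call; the real SDK call may differ.
        self.session.create_job(job_name=job_name, inputs=inputs)
        self.latest_job_name = job_name


def test_fit_passes_job_name_through():
    session = Mock()
    auto_ml = TinyAutoML(session)
    auto_ml.fit("s3://bucket/train.csv", job_name="my-fixed-name")
    # Fails if fit() ever changes the name before handing it to the session.
    session.create_job.assert_called_once_with(
        job_name="my-fixed-name", inputs="s3://bucket/train.csv"
    )
    assert auto_ml.latest_job_name == "my-fixed-name"
    return True
```

Because the session is mocked, a test like this runs without creating a real job, so it can pin the exact expected name cheaply and leave the integ test to check end-to-end status.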

Contributor Author

I can get behind both of those arguments =)

     assert auto_ml_desc["AutoMLJobObjective"] == job_objective
     assert auto_ml_desc["ProblemType"] == problem_type
     assert auto_ml_desc["OutputDataConfig"]["S3OutputPath"] == output_path