healthcare API: update v1beta1 datasets samples to v1 #3300


Merged
merged 23 commits
Apr 15, 2020

Conversation


@noerog noerog commented Apr 7, 2020

No description provided.

@noerog noerog requested a review from a team as a code owner April 7, 2020 23:54
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Apr 7, 2020
@tmatsuo tmatsuo self-requested a review April 8, 2020 17:07

noerog commented Apr 8, 2020

All healthcare tests are now passing; the remaining failures are in firestore/cloud-client.

@tmatsuo tmatsuo left a comment

I went through the first pass and added some comments.

project_id = os.environ['GOOGLE_CLOUD_PROJECT']
service_account_json = os.environ['GOOGLE_APPLICATION_CREDENTIALS']

dataset_id = 'test_dataset-{}'.format(int(time.time()))
Contributor:

Can you use uuid.uuid4() instead of time.time()?
time.time() can easily conflict between multiple builds.

Contributor Author:

Done.
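The suggested fix can be sketched as follows; the `test_dataset-` prefix mirrors the sample, and the exact ID format is up to the author:

```python
import uuid

# uuid.uuid4() is collision-resistant, unlike int(time.time()),
# which can produce the same ID for two builds started in the same second.
dataset_id = 'test_dataset-{}'.format(uuid.uuid4())
```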

bucket)

# Clean up
dicom_stores.delete_dicom_store(
Contributor:

I don't think this cleanup is always called. Can you move it to the teardown part of another fixture?

Contributor Author:

Created new test_dicom_store fixture.

'serviceAccount:[email protected]',
'roles/viewer')

# Clean up
Contributor:

ditto

Contributor Author:

Created new test_dicom_store fixture.



# [START healthcare_dicom_store_set_iam_policy]
def set_dicom_store_iam_policy(
Contributor:

Here and elsewhere, you have service_account_json as the first argument. Do Application Default Credentials (ADC) not work?

Contributor Author:

I think it's for historical reasons in case ADC wasn't set. Would you prefer for it to be removed?

Contributor:

I prefer ADC because it's easier for users, but it's up to you. I won't block on this.

Contributor Author:

Changed all files to use ADC. I'm inclined to do the same thing with project ID. WDYT?
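A minimal sketch of the ADC-based client construction this thread converges on; the function name is illustrative, and the real samples define similar helpers:

```python
def get_client():
    """Return a Cloud Healthcare API client using Application Default
    Credentials (ADC); no service_account_json argument is needed.

    ADC is resolved automatically from GOOGLE_APPLICATION_CREDENTIALS,
    gcloud user credentials, or the GCE/GKE metadata server.
    """
    from googleapiclient import discovery  # google-api-python-client

    api_version = 'v1'
    service_name = 'healthcare'
    # No explicit credentials argument: discovery.build falls back
    # to google.auth.default(), i.e. ADC.
    return discovery.build(service_name, api_version)
```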

project_id = os.environ['GOOGLE_CLOUD_PROJECT']
service_account_json = os.environ['GOOGLE_APPLICATION_CREDENTIALS']

dataset_id = 'test_dataset-{}'.format(int(time.time()))
Contributor:

uuid

Contributor Author:

Done.


blob.upload_from_filename(resource_file)

time.sleep(10) # Give new blob time to propagate
Contributor:

This leads to test flakiness. Consider a more reliable approach.

Contributor Author:

Using retrying package to check for HttpError with backoff.
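The idea, shown here as a plain-stdlib sketch rather than the `retrying` decorators the PR actually uses, is to replace the fixed `time.sleep(10)` with exponential backoff that retries on the error class until the blob has propagated:

```python
import time


class HttpError(Exception):
    """Stand-in for googleapiclient.errors.HttpError."""


def retry_with_backoff(fn, retriable=(HttpError,), max_attempts=10,
                       initial_delay=1.0, max_delay=10.0):
    """Call fn(), retrying on `retriable` errors with exponential backoff."""
    delay = initial_delay
    for attempt in range(max_attempts):
        try:
            return fn()
        except retriable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(delay)
            delay = min(delay * 2, max_delay)
```

A test would wrap the operation that depends on the new blob in `retry_with_backoff` instead of sleeping for a fixed interval.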

import_object,
)

# Clean up
Contributor:

I'm concerned about having clean-up code in the test body.

Contributor Author:

Created new test_fhir_store fixture.

"roles/viewer",
)

# Clean up
Contributor:

ditto

Contributor Author:

Created new test_fhir_store fixture.

hl7v2_message_file)

hl7v2_message_id = ""
@eventually_consistent.call
Contributor:

FYI, we released gcp-devrel-py-tools 0.0.16. It lets you pass a tries argument to the decorator.

@eventually_consistent.call(tries=5)

This will shorten the test time on failures.

Contributor Author:

Do you recommend adding it here?

Contributor:

It's up to you. It's mostly for convenience during local development.

@tmatsuo tmatsuo Apr 10, 2020

@noerog We decided to remove gcp-devrel-py-tools from this repo. Sorry about the back and forth. Can you replace it with the retrying module? I think you know how (other code here uses it), but feel free to ask anything.

Also, if the sample code needs retrying, the sample command-line tool is also flaky. Does it make sense to put the retry logic in the sample code itself so that the command-line tool almost always works?

Contributor Author:

I don't think we even need to retry the hl7v2_messages_list test. It was broken when you added the decorator because the API surface changed, not because it was flaky.

@@ -0,0 +1,3 @@
pytest==5.3.2
gcp-devrel-py-tools==0.0.15
Contributor:

maybe:
gcp-devrel-py-tools==0.0.16

Contributor Author:

Done.

Comment on lines 36 to 37
discovery_url = '{}?version={}'.format(
discovery_api, api_version)
Contributor:

It looks like this API is publicly listed at https://www.googleapis.com/discovery/v1/apis, so you shouldn't need a discovery_url.

Providing the service name and version should be sufficient for building the client.

discovery.build(service_name, api_version, credentials=scoped_credentials)

Contributor Author:

Done in all files.

@tmatsuo tmatsuo left a comment

Just responded to your comment. I'll review all the files again tomorrow.

@tmatsuo tmatsuo left a comment

We decided to remove gcp-devrel-py-tools from this repo. Can you use retrying instead?

@kurtisvg FYI



tmatsuo commented Apr 10, 2020

@noerog Thanks, this PR is big. I'll go through all the files first thing next week. Sorry for the delay; I hope that works for you.


noerog commented Apr 11, 2020

No problem at all. Just so you know, this is basically a branch off the current v1beta1 code in https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/healthcare/api-client, updated for v1. I'm very happy to make improvements, but the vast majority of this code has already gone through past reviews. Thanks.

I also plan to delete the vast majority of the v1beta1 code once this PR is in and create a v1beta1 directory like the v1 branch here. There are a few features that didn't make it from v1beta1 to v1 that I'd like to keep though.

@tmatsuo tmatsuo left a comment

I reviewed the files in the datasets directory. I haven't reviewed other files in detail, but I can imagine there are similar issues in other directories too, so I decided to send the review now, rather than later.

I also have a suggestion: how about separating this PR into multiple PRs, one per subdirectory?

@@ -0,0 +1,425 @@
# Copyright 2018 Google LLC All Rights Reserved.
Contributor:

Is this file just copied from the v1beta directory?
Even if that's the case, I think the copyright year should be 2020 because it's a new file. I may be wrong; if you're not sure which is right, do you mind asking the OSPO folks?

Contributor Author:

Updated all .py files to 2020.

api_version = 'v1'
service_name = 'healthcare'

credentials = service_account.Credentials.from_service_account_file(
Contributor:

Do you still need to create the credentials manually?

If ADC just works with the client, the client can be created with just:

return discovery.build(service_name, api_version)

If you're sure that you need to pass the credentials manually, please ignore this comment.

Contributor Author:

Fixed here and in all other files where possible. Also updated the doc string to clarify that the credentials come from ADC.

In fhir_resources.py and dicomweb.py we have to use the requests package which requires passing in credentials manually, AFAIK.

dataset_parent = 'projects/{}/locations/{}'.format(
project_id, cloud_region)

body = {}
Contributor:

Do you need to create the body local variable?

Can you just do:

    request = client.projects().locations().datasets().create(
        parent=dataset_parent, body={}, datasetId=dataset_id)

?

Contributor Author:

Fixed here and in dicom_stores.py.

return response
except HttpError as e:
print('Error, dataset not created: {}'.format(e))
return ""
Contributor:

This function returns the response when it succeeds, and an empty string upon failure.

Does it make sense to just omit the return "" statement so that it returns None?

Contributor Author:

Done here and in all other relevant files.

return response
except HttpError as e:
print('Error, dataset not deleted: {}'.format(e))
return ""
Contributor:

ditto

Contributor Author:

Done.

cloud_region = 'us-central1'
project_id = os.environ['GOOGLE_CLOUD_PROJECT']

dataset_id = 'test-dataset-{}'.format(int(time.time()))
Contributor:

Can you use uuid4 instead?

Contributor Author:

Done, sorry for not catching it in this file.

time_zone)

# Clean up
datasets.delete_dataset(
Contributor:

Ditto.

Can you make another fixture that creates a temporary dataset and deletes it upon cleanup?

You may need a different dataset_id for each of those fixtures.

Contributor Author:

Done.

assert 'UTC' in out


def test_deidentify_dataset(capsys):
Contributor:

Please utilize a fixture.

@noerog noerog Apr 13, 2020

Using one fixture for the source dataset and another for the de-identification destination dataset.

assert 'De-identified data written to' in out


def test_get_set_dataset_iam_policy(capsys):
Contributor:

fixture

Contributor Author:

Done.


noerog commented Apr 13, 2020

I removed the FHIR files from this PR; they're now in #3384. I will update after I've done the same with DICOM and HL7v2. This PR needs to go in first because the other directories depend on files in datasets/.

@tmatsuo tmatsuo left a comment

Again I only reviewed files in datasets.

api_version = 'v1'
service_name = 'healthcare'

return discovery.build(
Contributor:

Nit: You can have this in one line.

Contributor Author:

Done.

def create_dataset(
project_id,
cloud_region,
dataset_id):
Contributor:

Does this signature fit within 79 chars? If so, why not have it on one line?

Contributor Author:

Done.



# [START healthcare_delete_dataset]
def delete_dataset(
Contributor:

ditto

Contributor Author:

Done.



# [START healthcare_get_dataset]
def get_dataset(
Contributor:

also this should fit in one line



# [START healthcare_patch_dataset]
def patch_dataset(
Contributor:

also fit within one line

Contributor Author:

Both done.



# [START healthcare_dataset_get_iam_policy]
def get_dataset_iam_policy(
Contributor:

Also fit within one line


# Delete the destination_dataset_id which
# is created as part of the de-id test.
datasets.delete_dataset(
Contributor:

This is unreliable

Contributor Author:

I added crud_dataset_id and dest_dataset_id fixtures with a retry check for non-404 errors. I only special-cased 404 because I can't think of another error code we'd want to treat the same way. Is this what you were looking for?


noerog commented Apr 13, 2020

hl7v2/ PR: #3388
dicom/ PR: #3387
fhir/ PR: #3384

@noerog noerog changed the title healthcare API: update all v1beta1 samples to v1 and create v1 directory healthcare API: update v1beta1 datasets samples to v1 Apr 13, 2020
@tmatsuo tmatsuo left a comment

Thanks, almost there


@pytest.fixture(scope="module")
def test_dataset():
dataset = datasets.create_dataset(
Contributor:

Can you add retries for creating/deleting here too?

Contributor Author:

Done. Interestingly I didn't need to yield the dataset to get the fixture value because "dataset" isn't used anywhere.

@tmatsuo tmatsuo left a comment

@noerog Thanks!

Sorry I didn't catch it in the previous pass, but I think the callback you're using is not right. See the comment below.

wait_exponential_max=10000,
stop_max_attempt_number=10)
def create():
datasets.create_dataset(project_id, cloud_region, dataset_id)
Contributor:

Can you also ignore 409 conflict here?

Contributor Author:

Done.

Contributor:

Oh, sorry, I was not clear.

The try-except clause looks good, but I'd like it to be retried. Can you keep @retry and def create(), then add the try-except clause inside create()?

Contributor Author:

Ahh sorry, fixed.

Contributor:

Can you keep the try-except block in all the setup/cleanup code?

stop_max_attempt_number=10,
retry_on_exception=retry_if_server_exception)
def clean_up():
datasets.delete_dataset(project_id, cloud_region, dataset_id)
Contributor:

ignore 404

Contributor Author:

Done.

Contributor:

please keep @retry

Contributor Author:

Done.

stop_max_attempt_number=10,
retry_on_exception=retry_if_server_exception)
def clean_up():
datasets.delete_dataset(
Contributor:

ignore 404

Contributor Author:

Done.

Contributor:

ditto

Contributor Author:

Done.

stop_max_attempt_number=10,
retry_on_exception=retry_if_server_exception)
def clean_up():
datasets.delete_dataset(project_id, cloud_region, dataset_id)
Contributor:

ignore 404

Contributor Author:

Done.

Contributor:

ditto

Contributor Author:

Done.

@tmatsuo tmatsuo left a comment

Sorry I was not super clear. I'd like to keep @retry.


@tmatsuo tmatsuo left a comment

Thanks! It looks great now.

@tmatsuo tmatsuo left a comment

Sorry, one last change from me.

I think we should remove the try-except from at least create_dataset and delete_dataset for robust test setup and cleanup.

Also, maybe we should remove try-except from all the sample code, because the current code just throws away all the error information. That is not good practice anyway.

@tmatsuo tmatsuo left a comment

Sorry, I'm afraid you misunderstood my request.
What I asked was to remove try-except from datasets.py, not from datasets_test.py.

I think we need both try-except and retry for the setup/cleanup code in datasets_test.py.

If we only retry, it's problematic because:

  • creation fails if the first create_dataset call failed but the creation actually succeeded on the server side
  • deletion fails if the test code already successfully deleted the CRUD dataset
  • deletion fails if the first delete_dataset call failed but the deletion actually succeeded on the server side
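The combination the review settles on, a retry wrapped around a body that swallows "already done" errors, can be sketched as follows. The `retry` decorator and `HttpError` class here are simplified stand-ins for the `retrying` package and `googleapiclient.errors.HttpError`:

```python
import functools
import time


class HttpError(Exception):
    """Stand-in for googleapiclient.errors.HttpError."""
    def __init__(self, status):
        super().__init__("HTTP {}".format(status))
        self.status = status


def retry(max_attempts=10, delay=0.001):
    """Minimal stand-in for retrying.retry with exponential backoff."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise
                    time.sleep(delay * 2 ** attempt)
        return wrapper
    return deco


def clean_up(delete_dataset):
    # Retry transient failures, but treat 404 as success: the dataset
    # may already be gone if an earlier delete reached the server.
    @retry(max_attempts=5)
    def _delete():
        try:
            delete_dataset()
        except HttpError as err:
            if err.status == 404:
                print('Dataset already deleted.')
            else:
                raise
    _delete()
```

The create side is symmetric: retry around a body that swallows 409 Conflict, so a create that reached the server despite a failed response does not break setup.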


noerog and others added 2 commits April 15, 2020 14:20
more robust test setup and cleanup

tmatsuo commented Apr 15, 2020

@noerog @busunkim96
I added a commit to noerog's branch. PTAL


tmatsuo commented Apr 15, 2020

The test is taking a long time because there is a significant diff between master and the change's base.

@noerog
FYI, rebasing onto master often shortens the build time; you may want to do that for upcoming PRs.

@tmatsuo tmatsuo merged commit 5a3cd08 into GoogleCloudPlatform:master Apr 15, 2020