GODRIVER-2114 Fix failing KMS TLS tests #712
EOF
mongo --nodb mock_kms.js
. ./activate_venv.sh
- command: shell.exec
Much like the mock OCSP functions, the first command sets up the local environment in the foreground, and the second command starts the Python mock server in the background. These need to be separated for the tests to consistently find the mock KMS server.
Interesting. https://github.com/evergreen-ci/evergreen/wiki/Project-Commands#shellexec notes:
background: if set to true, does not wait for the script to exit before running the next commands
My new hypothesis for the cause of the connection refused errors:
- Evergreen would run `start-kms-mock-server` and proceed before the script completed.
- The Go driver tests started before the mock KMS server started.
Activating the virtual environment in a non-background command beforehand helps, but I think this is still hiding a race.
If the mock KMS server does not establish listening sockets before the Go driver tests run, I suspect the same issue will occur. But, given that the OCSP tasks have a similar setup, I bet the likelihood of the KMS server not starting before the Go tests run is slim to none. If we see it failing in the future, we could consider appending a foreground command to loop until it can establish a connection on port 8000. That seems unnecessary for now.
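For reference, the foreground loop mentioned above could look roughly like the following. This is only a minimal sketch, assuming the mock KMS server listens on localhost port 8000 as discussed; `wait_for_port` is a hypothetical helper name, not something that exists in the driver repository or in Evergreen.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server has a listening socket.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not up yet.
            time.sleep(interval)
    return False
```

A foreground Evergreen command could call something like `wait_for_port("localhost", 8000)` after the background command and fail the setup step if it returns `False`, so the Go tests would never start against a server that is not yet listening.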
That sounds exactly right. I think all the current mock servers in testing (KMS, OCSP and maybe load balancer?) have this racy behavior. It seems that if you only have the server-starting call in the `background` function, the tests pretty much never start before the server. So, if we start to see failures, we can consider something like a foreground loop.
LGTM! Will this need to be cherry-picked onto the 1.7 branch to have tests passing on that branch? If so, can you create a ticket to track this change? (The description can be brief.) That will tie those commits together.
@@ -827,20 +827,18 @@ functions:
  start-kms-mock-server:
    - command: shell.exec
      type: test
Removing `type: test` seems right here. The default `command_type` on L13 is `setup`. If this task fails, it will indicate a setup failure rather than a test failure (https://github.com/evergreen-ci/evergreen/wiki/Project-Configuration-Files#command-failure-colors).
Yeah, `setup` definitely seems like the right type; not sure why I had `test` before.
Filed GODRIVER-2114. Let's backport to both release/1.7 and release/1.6 since they both have KMS TLS tests and corresponding Evergreen waterfall tasks.
Uses the new `kms_http_server.py` instead of the now-removed, trivial `mock_kms.js`.