Skip to content

CDRIVER-4062 test load balancer on evg #837

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 31 commits into from
Aug 12, 2021

Conversation

kevinAlbs
Copy link
Collaborator

@kevinAlbs kevinAlbs commented Jul 30, 2021

Scope

  • Add tasks to the Evergreen configuration based to test against a loadbalanced cluster.
    • Tests with auth+SSL and with no auth+SSL on both server version 5.0 and latest.
    • Runs tests against: new load balanced tests, all unified tests, retryable reads, retryable writes, change streams, and the C driver specific /loadbalanced/* tests.
  • Update test-libmongoc to support testing against a load balancer.
    • Support the environment variable MONGOC_TEST_LOADBALANCED to enable testing against load balancers in test-libmongoc.
    • Support the environment variables SINGLE_MONGOS_LB_URI, MULTI_MONGOS_LB_URI to set URIs for test clients.
    • Mock the serviceId when MONGOC_TEST_LOADBALANCED is true.
    • Support multiple occurrences of the -l flag in test-libmongoc and alias the flag as --match.
    • Remove unnecessary constraint of retryable writes tests for replica sets.
  • CDRIVER-4060 add remaining support to the unified test runner for the new loadbalancer spec tests.
    • Iterate the first result of a FindCursor in the createFindCursor operation.
    • Support appname as a URI parameter.
    • Skip listening or comparing CMAP (Connection Monitoring and Pooling) events, since the C driver does not implement CMAP (per CDRIVER-3525).

Notes for reviewers

  • 73 of the files changed are from updating the JSON files from the specifications repo. Those were reviewed when they were added to the mongodb/specifications repo, so there is no need to review them. The "File Filter" in the review can collapse all of the JSON files.

@kevinAlbs kevinAlbs force-pushed the evergreen.loadbalancer branch 2 times, most recently from f76b6ce to 95408ed Compare August 4, 2021 15:14
@kevinAlbs kevinAlbs changed the title [WIP] Test load balancer on evg CDRIVER-4062 Test load balancer on evg Aug 4, 2021
@kevinAlbs kevinAlbs changed the title CDRIVER-4062 Test load balancer on evg CDRIVER-4062 test load balancer on evg Aug 4, 2021
@@ -440,6 +440,9 @@ functions:
export MONGOC_TEST_GCP_EMAIL="${client_side_encryption_gcp_email}"
export MONGOC_TEST_GCP_PRIVATEKEY="${client_side_encryption_gcp_privatekey}"
fi
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the Evergreen config file is generated from Python scripts. The relevant changes are in tasks.py, functions.py, and variants.py.

@@ -51,11 +51,11 @@ get_mongodb_download_url_for ()
_DISTRO=$1
_VERSION=$2

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an updated copy of the file in mongodb-labs/drivers-evergreen-tools.

@@ -657,7 +657,7 @@ def days(n):
], {}, batchtime=days(7)),
Variant ('packaging', 'Linux Distro Packaging', 'ubuntu1804-test', [
'debian-package-build',
OD([('name', 'rpm-package-build'), ('distros', ['rhel80-test'])]),
OD([('name', 'rpm-package-build'), ('distros', ['rhel82-arm64-small'])]),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by fix. The config.yml file was manually updated in CDRIVER-4112 without the python generation files being updated.

@@ -2321,6 +2321,7 @@ _mongoc_cluster_stream_for_server (mongoc_cluster_t *cluster,
mongoc_server_stream_cleanup (server_stream);
mongoc_cluster_disconnect_node (cluster, server_id);
bson_mutex_unlock (&topology->mutex);
_mongoc_bson_init_if_set (reply);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug fix. If a reply argument is passed, callers expect it to always be initialized (even on error).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume no ticket is needed for this since it's fixing a bug from a previous PR that was never released. Correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this was introduced in CDRIVER-4056: 4919d4f#diff-7723202fa2b19410a0b6dd42b6dda8d3708c52f85e7b553977c0bbe324031b41R2299

I do not think a ticket is necessary since CDRIVER-4056 is unreleased and is part of load balancer support.

@@ -137,6 +137,7 @@ TestSuite_Init (TestSuite *suite, const char *name, int argc, char **argv)
suite->flags = 0;
suite->prgname = bson_strdup (argv[0]);
suite->silent = false;
_mongoc_array_init (&suite->match_patterns, sizeof (char *));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, now test-libmongoc can be invoked with multiple occurrences of -l <pattern>. This was necessary to invoke with the multiple subset of tests we need to run against a load balanced deployment.

I also added an alias of --match.

@@ -2774,6 +2774,6 @@ test_change_stream_install (TestSuite *suite)
TestSuite_AddMockServerTest (
suite, "/change_streams/prose_test_18", prose_test_18);

test_framework_resolve_path (JSON_DIR "/change_streams", resolved);
test_framework_resolve_path (JSON_DIR "/change_streams/legacy", resolved);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In resyncing the change_streams tests, there are now two subdirectories (unified and legacy). The unified tests are installed in the unified test runner: runner.c.

@@ -857,36 +863,6 @@ TestSuite_PrintJsonFooter (FILE *stream) /* IN */
fflush (stream);
}


static int
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A distinct TestSuite_RunSerial is no longer needed. If there are no match patterns specified TestSuite_RunAll considers anything to match.

@@ -31,6 +31,11 @@
#define REDUCED_HEARTBEAT_FREQUENCY_MS 500
#define REDUCED_MIN_HEARTBEAT_FREQUENCY_MS 50

struct _entity_findcursor_t {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursors are lazy in libmongoc, so they do not send the initial command (in this case find) until the first iteration.

This updates the implementation of the createFindCursor test operation to adhere to the specification:

https://github.com/mongodb/specifications/blob/86197086c2290a67500321f008b7a0a0685f0da7/source/unified-test-format/unified-test-format.rst#createfindcursor

Test runners for drivers that lazily execute the find command on the first iteration of the cursor MUST iterate the resulting cursor once.

@kevinAlbs kevinAlbs marked this pull request as ready for review August 4, 2021 17:21
@kevinAlbs kevinAlbs requested review from benjirewis and chardan August 4, 2021 18:15
Copy link
Contributor

@chardan chardan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good! There are a few things you might want to double-check. Please note that I don't have a great understanding of the test system and it's hard for me to usefully assess what's going on in the long stretches of this, so you may definitely want a second opinion, but they /look/ ok to me. Lots of effort, nice job!

axes = OD([
('asan', [True]),
('build_ssl', ['openssl']), # The SSL library the C driver is built with.
('test_ssl', [True, False]), # Whether tests are run with SSL connections.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there are two Booleans there, does this mean "either true or false" or does one flag signify something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinAlbs can confirm, but I think the list of values is used to build the matrix. So we'll end up with LB tests running both with and without SSL.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmikola is right. This is to build a matrix to test with SSL and without SSL. This is to satisfy the requirement in the load balancer test readme:

drivers MUST add two Evergreen tasks: one with a sharded cluster with both authentication and TLS enabled and one with a sharded cluster with authentication and TLS disabled.

The _check_allowed function below prevents generating tasks where authentication and TLS do not match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, got it, thanks.

@@ -200,4 +200,6 @@ bool
mongoc_server_description_has_service_id (
const mongoc_server_description_t *description);

extern bool global_mock_service_id;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is a good and pretty unique identifier, but /should/ it be "mongoc_" or some such?
question: Any concerns about concurrent writes to this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good point! I think otherwise a consumer of libmongoc could have conflict with this symbol since it is not static. I added the mongoc_ prefix and noted with a comment the purpose of this variable.

Any concerns about concurrent writes to this?

I do not think so. The test runner only sets this to true once if running against a load balanced cluster. It is never written to again.

@@ -33,6 +33,8 @@ static bson_oid_t kObjectIdZero = {{0}};

const bson_oid_t kZeroServiceId = {{0}};

bool global_mock_service_id = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 yay, initialization!

}
suite->testname = bson_strdup (argv[++i]);
val = bson_strdup (argv[++i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a corresponding bson_free()?

note: bson_strdup() can return NULL if passed a NULL parameter, but I don't believe that's possible here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a corresponding bson_free()?

Yes, match_patterns is an array of char *. So L187 transfers ownership of val to match_patterns. L1018 frees the values in match_patterns.

note: bson_strdup() can return NULL if passed a NULL parameter, but I don't believe that's possible here.

I don't believe so either since argc was checked.

/* If no match patterns were provided, then assume all match. */
if (suite->match_patterns.len == 0) {
matches = true;
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: What you have here is fine, but the function is a bit simpler when the positive if-case just returns and this "else" is the rest of the function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinAlbs https://wiki.c2.com/?ElseConsideredSmelly is a good read if you have some spare time

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated and removed the matches variable. An earlier iteration of these changes supported a --skip and --after flag. I removed them because they were unnecessary, but did not clean up this function.

@kevinAlbs wiki.c2.com/?ElseConsideredSmelly is a good read if you have some spare time

Oh thank you, I have added that to my reading list.

if (!test_framework_uri_apply_multi_mongos (
uri, *use_multiple_mongoses, error)) {
goto done;
if (*use_multiple_mongoses && test_framework_is_loadbalanced ()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good time to meditate on "use_multiple_mongoses" and the nature of plurality. ;-)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, I am following the specification's lead with useMultipleMongoses. I wouldn't mind useMultipleMongeese.

return _entity_map_add (em, id, "cursor", (void *) cursor, error);
entity_findcursor_t *findcursor;

findcursor =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this bason_malloc() abort()s the process on failure.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

test_error ("error applying multiple mongos: %s", error.message);
/* In load balanced mode, the internal client must use the SINGLE_LB_MONGOS_URI. */
if (!test_framework_is_loadbalanced ()) {
/* Always use multiple mongos's if speaking to a mongos.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suspicion is that "mongoses" is probably correct (though odd). There must be a definitive answer! ;-)
https://www.merriam-webster.com/words-at-play/what-happens-to-names-when-we-make-them-plural-or-possessive

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"mongoses" is also used in specs. Consider the useMultipleMongoses test syntax.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating to mongoses.

* supported. */
ret = true;
goto done;
} else if (0 == strcmp (event_type, "command")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we've already left via goto, the extra branching isn't needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to remove extra else.

NULL
};

char ** iter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider declaring this in the loop, rather than at a distance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C89 prohibits declaring within the loop unfortunately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No C17 for us.

@kevinAlbs kevinAlbs requested review from jmikola and removed request for benjirewis August 6, 2021 00:45
axes = OD([
('asan', [True]),
('build_ssl', ['openssl']), # The SSL library the C driver is built with.
('test_ssl', [True, False]), # Whether tests are run with SSL connections.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinAlbs can confirm, but I think the list of values is used to build the matrix. So we'll end up with LB tests running both with and without SSL.

@@ -311,6 +311,9 @@
export MONGOC_TEST_GCP_EMAIL="${client_side_encryption_gcp_email}"
export MONGOC_TEST_GCP_PRIVATEKEY="${client_side_encryption_gcp_privatekey}"
fi
export LOADBALANCED=${LOADBALANCED}
export SINGLE_MONGOS_LB_URI="${SINGLE_MONGOS_LB_URI}"
export MULTI_MONGOS_LB_URI="${MULTI_MONGOS_LB_URI}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these values coming from the Evergreen project configuration?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, these are Evergreen expansions that are created when the Evergreen task is run.

In config.yml, the "start load balancer" function runs the run-load-balancer.sh script from driver-evergreen-tools. It generates these URIs and writes them to a file on this line. The "start load balancer" function then uses the expansions.update command to load these values as Evergreen expansions.

@@ -2321,6 +2321,7 @@ _mongoc_cluster_stream_for_server (mongoc_cluster_t *cluster,
mongoc_server_stream_cleanup (server_stream);
mongoc_cluster_disconnect_node (cluster, server_id);
bson_mutex_unlock (&topology->mutex);
_mongoc_bson_init_if_set (reply);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume no ticket is needed for this since it's fixing a bug from a previous PR that was never released. Correct?

@@ -365,6 +392,9 @@ entity_client_new (entity_map_t *em, bson_t *bson, bson_error_t *error)
mongoc_apm_set_command_failed_cb (callbacks, command_failed);
} else if (0 == strcmp (event_type, "commandSucceededEvent")) {
mongoc_apm_set_command_succeeded_cb (callbacks, command_succeeded);
} else if (is_unsupported_event_type (event_type)) {
MONGOC_DEBUG ("Skipping observing unsupported event type: %s", event_type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to error here and require the test be explicitly skipped?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue it still has some value to run the test even if we cannot make assertions on CMAP events.

* and should be skipped for drivers that do not."
* TODO: (CDRIVER-3525) add this assertion when CMAP is implemented. */
result_from_ok (result);
return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the tests should be skipped, should this just raise an error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can still run the test and make other assertions. I have a preference for keeping this as-is.

/* TODO: (CDRIVER-3525) Explicitly ignore cmap events until CMAP is
* supported. */
ret = true;
goto done;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may overlap with my previous question but would it be preferable to error here and require skipping the tests?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a preference for skipping the CMAP assertions and still running the test. They won't be testing as much, but it seems better to run what we can rather than skip the full test.

@kevinAlbs kevinAlbs requested review from chardan and jmikola August 10, 2021 21:58
Copy link
Contributor

@chardan chardan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With cleanups, +1 from me! Nice work on this.

@kevinAlbs kevinAlbs force-pushed the evergreen.loadbalancer branch from 9b28636 to 808424c Compare August 11, 2021 15:05
@kevinAlbs kevinAlbs merged commit 932e7e4 into mongodb:master Aug 12, 2021
@kevinAlbs kevinAlbs deleted the evergreen.loadbalancer branch August 12, 2021 00:39
@kevinAlbs kevinAlbs restored the evergreen.loadbalancer branch February 3, 2022 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants