Skip to content

CDRIVER-3895: Remove dots/dollars validation for insert/replace docs #801

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Jun 16, 2021

Conversation

jmikola
Copy link
Member

@jmikola jmikola commented Jun 8, 2021

jmikola added 14 commits June 11, 2021 18:50
Also replaces CRUD legacy/v2 tests with unified equivalents for CDRIVER-3960.

Synced with mongodb/specifications@a124e21
This adds top-level key validation to _mongoc_validate_replace, similar to what existed in _mongoc_validate_update.
Replacement docs will already be validated by _mongoc_validate_replace before this function is reached.

This adds logic to ensure "q" and "u" documents are present, similar to what existed in _mongoc_write_command_delete_legacy.
The failure code path is testing that arguments are still validated by mongoc-util.c, despite replacement validation being removed in a previous commit.
Empty keys can be used as a reliable BSON validation error, since insert, replace, and update all specify BSON_VALIDATE_EMPTY_KEYS in their vflags. Error message expectations are adjusted accordingly.

An _id document with a dollar-prefixed key is also used as a reliable server-side validation error.
Allow entire files to be skipped to preempt schema version errors.
@jmikola jmikola changed the title [WIP] CDRIVER-3895: Remove dots/dollars validation for insert/replace docs CDRIVER-3895: Remove dots/dollars validation for insert/replace docs Jun 11, 2021
@jmikola jmikola marked this pull request as ready for review June 11, 2021 22:52
@jmikola jmikola requested review from benjirewis, kevinAlbs and alcaeus and removed request for benjirewis and kevinAlbs June 11, 2021 22:52
@@ -149,7 +149,7 @@
]
},
"collection": "test",
"batchSize": 3
"batchSize": 1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change reverts #793, which manually modified the test file to work around a deviation in libmongoc's behavior.

Since the original test fails, this is skipped in runner.c. It's not clear to me if this pertains more to DRIVERS-1781 or DRIVERS-1448, so I'd appreciate @alcaeus's input here. I did confirm that this test fails with both MongoDB 4.4 and 5.0, so I don't think it's related to the slot-based execution changes in 5.0.

const char *test_description;
} skipped_unified_test_t;

#define SKIP_ALL_TESTS NULL
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I could have used this for unacknowledged-bulkWrite-delete-hint-clientError and related tests, I decided not to as we know exactly what tests should be skipped there. This was really only necessary for cases where we can't even parse tests from a file due to an unsupported schema version.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having one place to list explicitly skipped tests, with an option to skip all tests in one file is helpful.

{"unacknowledged-updateOne-hint-clientError", "Unacknowledged updateOne with hint string fails with client-side error"},
{"unacknowledged-updateOne-hint-clientError", "Unacknowledged updateOne with hint document fails with client-side error"},
/* CDRIVER-4001, DRIVERS-1781, and DRIVERS-1448: 5.0 cursor behavior */
{"poc-command-monitoring", "A successful find event with a getmore and the server kills the cursor"},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pertains to my earlier comment about reverting #793 and skipping the original test from the specs repo.

/* upsertedIds is a required field in BulkWriteResult, so append an
* empty document even if not documents were upserted. */
upserted_ids = bson_new ();
BSON_APPEND_DOCUMENT (write_result, "upsertedIds", upserted_ids);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to have been an oversight in the original test runner implementation.

}

mongoc_bulk_operation_insert_with_opts (
bulk, document, bson_parser_get_extra (parser), error);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous code made it impossible to evaluate expectError for these methods, since any error interrupted the test runner's execution (like a parsing error or unsupported operation). This change is consistent with how other public API methods are invoked.

ASSERT_ERROR_CONTAINS (error,
MONGOC_ERROR_COMMAND,
MONGOC_ERROR_COMMAND_INVALID_ARG,
"keys cannot contain \".\": \"a.b\"");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was nothing to replace this test with, since we already use empty keys earlier and invalid UTF-8 isn't an option (due to how tmp_bson parses JSON).


const bson_validate_flags_t _mongoc_default_replace_vflags =
BSON_VALIDATE_UTF8 | BSON_VALIDATE_UTF8_ALLOW_NULL |
BSON_VALIDATE_EMPTY_KEYS | BSON_VALIDATE_DOT_KEYS |
BSON_VALIDATE_DOLLAR_KEYS;
BSON_VALIDATE_EMPTY_KEYS;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change all three vflags are identical. The vflags aren't part of the public API, so we could have consolidated them into one value; however, I think leaving them as-is is fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, leaving them as-is seems OK.

MONGOC_ERROR_COMMAND,
MONGOC_ERROR_COMMAND_INVALID_ARG,
"Invalid key '%s': replace prohibits $ operators",
key);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically the inverse of the existing logic in _mongoc_validate_update. We technically only need to check the first key since the server would raise an error (as discussed in the CRUD spec), but I opted to keep this the same as update validation. If anyone feels strongly about this, feel free to open a new CDRIVER ticket to propose a future change to both validation methods.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I can see not to validate all fields is performance. But the previous behavior was checking for all fields (including nested fields) with bson_validate_with_error, and performance was not a known issue. So I am OK with checking all top-level fields.

bson_iter_find (&subiter, "u") &&
BSON_ITER_HOLDS_DOCUMENT (&subiter));

BSON_ASSERT (r);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking that both fields exist is necessary since we don't explicitly fetch the fields later. For whatever reason, _mongoc_write_command_update_legacy iterates through all fields and could fail to fully construct the command if one was absent. In practice, I doubt that would ever happen since the BSON for this is constructed elsewhere in libmongoc and doesn't originate from the user.

Note: I opted not to test for this assert failure, since it really isn't possible to trigger it from the public API.

@@ -27,6 +27,7 @@ typedef enum {
BSON_PARSER_BOOL,
BSON_PARSER_DOC,
BSON_PARSER_ARRAY,
BSON_PARSER_ARRAY_OR_DOC,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was necessary for accepting documents or pipeline arrays for update operations.

@jmikola
Copy link
Member Author

jmikola commented Jun 15, 2021

Evergreen looks good apart from the known failures. I think this is ready for a full review.

Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


const bson_validate_flags_t _mongoc_default_replace_vflags =
BSON_VALIDATE_UTF8 | BSON_VALIDATE_UTF8_ALLOW_NULL |
BSON_VALIDATE_EMPTY_KEYS | BSON_VALIDATE_DOT_KEYS |
BSON_VALIDATE_DOLLAR_KEYS;
BSON_VALIDATE_EMPTY_KEYS;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, leaving them as-is seems OK.

MONGOC_ERROR_COMMAND,
MONGOC_ERROR_COMMAND_INVALID_ARG,
"Invalid key '%s': replace prohibits $ operators",
key);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I can see not to validate all fields is performance. But the previous behavior was checking for all fields (including nested fields) with bson_validate_with_error, and performance was not a known issue. So I am OK with checking all top-level fields.

@@ -459,6 +430,10 @@ _mongoc_write_command_update_legacy (mongoc_write_command_t *command,
while (bson_iter_next (&subiter)) {
if (strcmp (bson_iter_key (&subiter), "u") == 0) {
bson_iter_document (&subiter, &len, &data);

BSON_ASSERT (data);
BSON_ASSERT (len >= 5);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Asserting the document is at least as large as an empty document is a good safety check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shamelessly copied from _mongoc_write_command_delete_legacy.


r = mongoc_collection_insert_one (
collection, tmp_bson ("{'a.b': 1}"), NULL, NULL, &error);
collection, tmp_bson ("{'': 1}"), NULL, NULL, &error);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. An embedded NULL would not work with tmp_bson either. Besides, checking for valid UTF-8 keys is outside of the scope of this ticket.

@@ -454,7 +454,7 @@ entity_database_new (entity_map_t *entity_map,
if (coll_or_db_opts->rp) {
mongoc_database_set_read_prefs (db, coll_or_db_opts->rp);
}
if (coll_or_db_opts->rc) {
if (coll_or_db_opts->wc) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find :)


/* clang-format off */
skipped_unified_test_t SKIPPED_TESTS[] = {
/* CDRIVER-3630: libmongoc does not unconditionally raise an error when using
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, IIRC we only raise an error if the server version indicates that hint is not supported.

const char *test_description;
} skipped_unified_test_t;

#define SKIP_ALL_TESTS NULL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having one place to list explicitly skipped tests, with an option to skip all tests in one file is helpful.

#define SKIP_ALL_TESTS NULL

/* clang-format off */
skipped_unified_test_t SKIPPED_TESTS[] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you for adding this!

Copy link
Contributor

@benjirewis benjirewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (just some opt nits). Helpful for the Go implementation 🧑‍🔧

err,
MONGOC_ERROR_COMMAND,
MONGOC_ERROR_COMMAND_INVALID_ARG,
"Invalid key '$set': replace prohibits $ operators");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[opt] Looks like these error messages were lowercase before, but I'm not sure what the norm is in the C driver.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous error messages originated in libbson (see: bson.c). The message here is consistent with what this file already does for update validation:

Invalid key '%s': update only works with $ operators and pipelines

@jmikola jmikola merged commit 2597f6d into mongodb:master Jun 16, 2021
@jmikola jmikola deleted the cdriver-3895 branch June 16, 2021 15:32
jmikola added a commit that referenced this pull request Jun 21, 2021
…801)

* CDRIVER-3895: Update CRUD, transactions, and unified spec tests

Also replaces CRUD legacy/v2 tests with unified equivalents for CDRIVER-3960.

Synced with mongodb/specifications@a124e21

* Skip unified CRUD tests related to CDRIVER-3630

* Ensure upsertedIds is always present in BulkWriteResult

* Allow array or document for updateOne/Many update arg

* Remove dot/dollar restrictions for insert/replace validation

This adds top-level key validation to _mongoc_validate_replace, similar to what existed in _mongoc_validate_update.

* Remove redundant replace doc validation for legacy updates

Replacement docs will already be validated by _mongoc_validate_replace before this function is reached.

This adds logic to ensure "q" and "u" documents are present, similar to what existed in _mongoc_write_command_delete_legacy.

* Test success and failure for legacy update code path

The failure code path is testing that arguments are still validated by mongoc-util.c, despite replacement validation being removed in a previous commit.

* Revert "Skip tests with dots and dollars in field names on 5.0+ (#795)"

This reverts commit bd014ee.

* Revise bulk and collection tests for allowing dots/dollars

Empty keys can be used as a reliable BSON validation error, since insert, replace, and update all specify BSON_VALIDATE_EMPTY_KEYS in their vflags. Error message expectations are adjusted accordingly.

An _id document with a dollar-prefixed key is also used as a reliable server-side validation error.

* Allow methods in bulk_op_append to fail so expectError can be evaluated

* Skip unified tests due to schema version and outstanding issues

Allow entire files to be skipped to preempt schema version errors.

* Ensure WC is set on database and collection entities

Co-authored-by: Kevin Albertson <[email protected]>
@jmikola
Copy link
Member Author

jmikola commented Jun 21, 2021

Cherry-picked to r1.18 in 4069b79

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants