Skip to content

CDRIVER-5667 Add missing BSON type checks in special key handlers #1693

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 13, 2024

Conversation

eramongodb
Copy link
Contributor

@eramongodb eramongodb commented Aug 8, 2024

Resolves CDRIVER-5667. Verified by this patch.

Traced the issue to a code pattern discrepancy in _bson_json_read_map_key when handling special keys in the BSON_JSON_IN_BSON_TYPE read state. The BSON type field is reset to BSON_TYPE_EOD (value: 0) when transitioning from BSON_JSON_IN_START_MAP -> BSON_JSON_IN_BSON_TYPE. This PR identified that when subsequently identifying the special key value on re-entry into this function (such as when parsing the key of a subdocument), only some branches performed a check for a pre-existing non-EOD BSON type. Simplified:

if (bson->read_state == BSON_JSON_IN_BSON_TYPE) {
   // Via HANDLE_OPTION expansion.
   if (len == strlen ("$regex") && strncmp ((const char *) val, ("$regex"), len) == 0) {
      if (bson->bson_type && bson->bson_type != BSON_TYPE_REGEX) { // BSON type check. (!)
         _bson_json_read_set_error (...);
         return;
      }
      bson->bson_type = BSON_TYPE_REGEX;
      bson->bson_state = BSON_JSON_LF_REGEX; // LF: "Looking For"
   }

   // Similar pattern for "$options", "$oid", ..., "$numberDecimal".
   else if ...

   // Pattern break: `strcmp` instead of `strlen` + `strncmp`.
   else if (!strcmp ("$timestamp", (const char *) val)) {
      // Pattern break: no BSON type check. (!!)
      bson->bson_type = BSON_TYPE_TIMESTAMP;
      bson->read_state = BSON_JSON_IN_BSON_TYPE_TIMESTAMP_STARTMAP;
   }

   // Similar pattern for "$regularExpression", "$dbPointer", "$code", and "$scope".
   else if ...

   else {
      _bson_json_bad_key_in_type (reader, val);
   }
}

Adding the missing BSON type checks was sufficient to detect and handle the malformed input described in CDRIVER-5667. These checks prevent the state machine from reaching the point where _bson_json_read_append_code is invoked with a precondition violation (the top of JSON stack frame is not a previously-parsed scope document) by detecting+forbidding the presence of any intermediate and unrelated BSON types that could overwrite the current JSON stack frame with their own state.

Some drive-by improvements include:

  • Use of an X macro for special key strings to better identify and maintain the list of special keys (e.g. this PR removes several redundant strlen + memcmp calls for duplicate key checks).
  • Refactor of the strlen + memcmp (formerly strncmp) pattern into a dedicated macro for consistent reuse.
  • Refactor the BSON type check into a macro for consistent reuse.
  • Update JSON error test input values with BSON_STR to reduce need for escape characters.

@eramongodb eramongodb requested a review from kevinAlbs August 8, 2024 21:45
@eramongodb eramongodb self-assigned this Aug 8, 2024
Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I am happy to learn "LF" (presumably) means Looking For.

Copy link
Contributor

@vector-of-bool vector-of-bool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good fix and clean-up! LGTM

Comment on lines +377 to +378
} \
((void) 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be an else before the cast-to-void?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I omitted the else because it didn't appear to be necessary. The if is scoped with {}, so there is (AFAIK) no opportunity for else hijacking or other (plausible) mistakes which may produce unexpected behavior at usage sites. A plain ((void)0) seems sufficient to force a semicolon regardless of the macro definition (if or otherwise). If I am incorrect and it is advisable to keep the else anyways, a followup PR can patch it in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is so contained, I don't think there's any risk of anything goofy happening. My only thought was about:

if (x)
    HANDLE_OPTION_TYPE_CHECK(...)
else
    lol();

But that would just be a hard compile-error anyway.

@eramongodb eramongodb merged commit d3d1e53 into mongodb:master Aug 13, 2024
44 of 46 checks passed
@eramongodb eramongodb deleted the cdriver-5667 branch August 13, 2024 17:51
kevinAlbs pushed a commit that referenced this pull request Aug 30, 2024
)

* Use X macro for special keys

* Use consistent key comparison expressions

* CDRIVER-5667 Add missing BSON type checks in special key handlers

* Add regression test to /bson/json/errors
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants