Skip to content

CDRIVER-5580 treat commas in TOKEN_RESOURCE as a client error #1950

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 28, 2025

Conversation

eramongodb
Copy link
Contributor

@eramongodb eramongodb commented Mar 24, 2025

Related to CDRIVER-5580. Followup to #1914.

Update: focus of the PR has changed to properly forbid unexpected commas in TOKEN_RESOURCE for better consistency with other Drivers; see below.

This PR proposes supporting percent-encoded commas in KVPs of authMechanismProperties as an extension to the Connection String specification, which is contrary to the intended behavior of CDRIVER-5580.


The Connection String spec states:

For any option key-value pair that may contain a comma (such as TOKEN_RESOURCE), drivers MUST document that: a value containing a comma (",") MUST NOT be provided as part of the connection string. This prevents use of values that would interfere with parsing.

Similarly, the Authentication spec states:

Note: because the TOKEN_RESOURCE is often itself a URL, drivers MUST document that a TOKEN_RESOURCE with a comma , must be given as a MongoClient configuration and not as part of the connection string, and that the TOKEN_RESOURCE value can contain a colon : character.

Accordingly, this Connection String spec test requires a percent-encoded comma to be parsed as valid and emit a warning:

description: Comma in a key value pair causes a warning
uri: mongodb://localhost?authMechanism=MONGODB-OIDC&authMechanismProperties=TOKEN_RESOURCE:mongodb://host1%2Chost2,ENVIRONMENT:azure
valid: true
warning: true

A warning is expected by this spec test because the Connection String spec also states (however, note that this paragraph applies to query options, not to key-value pairs):

Any invalid Values for a given key MUST be ignored and MUST log a WARN level message.

There is ambiguity regarding exactly how and why a warning should be emitted by this spec test due to the expected value of authMechanismProperties being unspecified by this spec test:

  • Option 1: Emit a warning due to the presence of a comma in the value of TOKEN_RESOURCE?
    {
      "TOKEN_RESOURCE": "mongodb://host1,host2",
      "ENVIRONMENT:": "azure"
    }
  • Option 2: Emit a warning due to an invalid property following key-value pair (KVP) splitting on the percent-encoded comma?
    {
      "TOKEN_RESOURCE": "mongodb://host1",
      "host2": "",
      "ENVIRONMENT": "azure"
    }

Currently, mongoc favors Option 2 due to percent-decoding a connection string query option's value prior to further parsing according to the lowercased key (rough pseudocode):

# Given `str = "authMechanismProperties=a:1,b:2"`...
kvp = str.split('=')
if kvp.length() < 2:
    handle_error("URI option contains no '=' sign.")

key   = kvp[0]            # "authMechanismProperties"
value = str.join(kvp[1:]) # "a:1,b:2"

# Decode before further analysis.
value = value.decode()
if not value:
    handle_error(f"value for URI option {key} contains invalid UTF-8")

key = key.tolowercase()

# ... unrelated details ...

# Parse the option as an "authmechanismproperties".
mongoc_uri_bson_append_or_replace_key(key, value)

Given Option 2, the properties' KVPs will be split on both the percent-encoded and unencoded commas. The resulting "host2" property key would not be a valid or supported property for the MONGODB-OIDC authentication mechanism, which is now treated as a client error rather than a warning due to the changes in #1896:

MONGOC_URI_ERROR (error, "Unsupported '%s' authentication mechanism property: '%s'", mechanism, key);

(Note the associated comment regarding the Authentication spec overriding the Connection String spec concerning error handling.)

Option 1 can be supported by deferring percent-decoding of the query option's value until after it is parsed as a authMechanismProperties and after the KVPs are correctly split only on unencoded commas in mongoc_uri_parse_auth_mechanism_properties:

// Key-value pairs are delimited by ','.

However, this means that percent-encoded commas would be parsed completely fine without any interference with regular URI connection string parsing or parsing of other KVPs in authMechanismProperties. This makes the warning expected by the spec test seemingly redundant: there is nothing to warn about, as the behavior is well-defined and consistent.


In the spirit of supporting client errors for invalid properties (per the Authentication spec) as well as supporting a better user experience, this PR proposes supporting Option 1 without emitting a warning per the spec test requirements.

As documented in the associated comment in the new mongoc_uri_parse_auth_mechanism_properties_kvp function:

CDRIVER-5580: allow percent-encoded commas to be present, as they do not interfere with parsing due to late percent-decoding of authMechanismProperties. Unencoded commas are always treated as key-value pair delimiters first and cannot be diagnosed as part of a property value.

This deferred percent-decoding behavior is limited only to the authMechanismProperties URI option. This could be extended to other KVP-style options (i.e. readPreferenceTags), but given only TOKEN_RESOURCE is expected to contain commas anyways, other options are left unchanged.

Note the updated documentation for authentication mechanism properties added by #1941 already describe the behavior proposed in this PR, according to its PR description:

The wording is adjusted to indicate that mongoc will interpret the presence of a comma as a key-value pair delimiter. More importantly, in contrast to the spec and associated spec tests, percent-encoded commas will be permitted as an extension to improve user experience, as they do not interfere with the parsing of authMechanismProperties or the connection string as implied by the spec.

The discrepancy in behavior from spec test expectations is handled in test-mongoc-connection-uri.c by the spec test exception handler for "TOKEN_RESOURCE:mongodb://host1%2Chost2". Note: this test has not yet been synced with the repository due to the MONGODB-OIDC authentication mechanism not yet being supported by the URI parser.

Edit: fixed references to Option 1 and Option 2 which were mixed up.

@eramongodb eramongodb requested a review from kevinAlbs March 24, 2025 20:15
@eramongodb eramongodb self-assigned this Mar 24, 2025
@eramongodb eramongodb changed the title CDRIVER-5580 defer percent-decoding of authMechanismProperties until after KVP split CDRIVER-5580 allow percent-encoded commas in TOKEN_RESOURCE Mar 24, 2025
value = bson_strdup (end_key + 1);

if (strcmp (lkey, MONGOC_URI_AUTHMECHANISMPROPERTIES) == 0) {
// Defer percent-decoding until after key-value pairs are split on unencoded ',' in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am concerned this reduces portability of the connection string among other drivers. Example:

bson_error_t error;
const char *uristr = "mongodb://u:p@host/?authMechanism=GSSAPI&authMechanismProperties=SERVICE_HOST%3ahost";
// ----------------------------------------------------------------------------- : percent encoded ^^^
mongoc_uri_t *const uri = mongoc_uri_new_with_error (uristr, &error);
// Before PR  : parses as {'SERVICE_HOST': 'host'}
// After PR   : Error: Unsupported 'GSSAPI' authentication mechanism property: 'SERVICE_HOST%3ahost'
mongoc_uri_destroy (uri);

I expect other drivers passing the spec test also percent decode before splitting. Example in PyMongo:

from pymongo.uri_parser import parse_uri
print(parse_uri("mongodb://u:p@host/?authMechanism=GSSAPI&authMechanismProperties=SERVICE_HOST%3ahost"))
# {'nodelist': [('localhost', 27017)], 'username': 'user', 'password': 'pass', 'database': None, 'collection': None, 'options': {'authMechanism': 'GSSAPI', 'authMechanismProperties': {'SERVICE_HOST': 'host'}}, 'fqdn': None}

Suggest doing option 1 (warn/error on comma) for portability.

@eramongodb
Copy link
Contributor Author

eramongodb commented Mar 27, 2025

Reverted initially suggested changes in favor of treating unexpected commas in TOKEN_RESOURCE (including when percent-encoded) as an error per the Authentication spec (which is stricter than a warning per the Connection String spec) for better consistency with other Drivers.

As described above, the error is due to the invalid comma being treated as a KVP delimiter which produces an invalid mechanism property (e.g. "host2"). The URI documentation merged in #1941 is updated accordingly to avoid indicating support for percent-encoded commas. The comma spec test exception (expect: valid+warning, actual: invalid) is also relocated in the connection string spec test runner accordingly to match our stricter diagnostic behavior.

@eramongodb eramongodb requested a review from kevinAlbs March 27, 2025 16:39
@eramongodb eramongodb changed the title CDRIVER-5580 allow percent-encoded commas in TOKEN_RESOURCE CDRIVER-5580 treat commas in TOKEN_RESOURCE as a client error Mar 27, 2025
@eramongodb eramongodb merged commit 7a05b03 into mongodb:master Mar 28, 2025
40 of 42 checks passed
@eramongodb eramongodb deleted the cdriver-5580-commas branch March 28, 2025 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants