-
Notifications
You must be signed in to change notification settings - Fork 455
CDRIVER-5580 support embedded URI in connection string options #1914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The NEWS entry for authentication-related credential validation has been summarized into a single line summarizing the related changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Aside: curious if the URI parsing is a good candidate for fuzz testing.
I expect yes, although it would probably need a dictionary to avoid wasting too many cycles on invalid URI keys. |
Summary
Partially resolves CDRIVER-5580 by updating
mongoc_uri_parse
to support embedded URI syntax in query options. More specifically, refactors URI parsing to more correctly support the presence of/
,?
,:
, and,
within the connection string.Additionally fixes
mongoc_uri_parse_auth_mechanism_properties
to correctly use','
as the kvp delimiter and support the presence of':'
in property values.Note
The changes in this PR use MONGODB-OIDC authentication mechanism properties in new regression tests, but it does not add any support for MONGODB-OIDC itself, taking advantage of the lack of thorough validation of auth-related options and properties. However, these changes should be forward-compatible with those currently in review in #1896 which introduce said thorough validation.
Definitions
Given the following connection string adapted from the Connection String spec:
the following terms will be used to describe its parts for purposes of this PR:
mongodb
://
inmongodb://
user:pass
user
pass
:
inuser:pass
@
inuser:pass@
host1:27017,host2:27018
host1:27017
inhost1:27017,host2:27018
host1
inhost1:27017
27017
inhost1:27017
:
inhost:27017
,
inhost1:27017,host2:27018
db
/
in/db
?key1=value1&key2=value2
?
key1=value1
&
inkey1=value1&key2=value2
Problem
Note
For brevity, the scheme and its delimiter will be omitted from connection string examples below.
The
mongoc_uri_parse
function inmongoc-uri.c
parses a connection string and produces amongoc_uri_t
object representing the described URI. Excluding UTF-8 validation, the algorithm currently used may be described by the following pseudocode:When
options
is not empty, it is subsequently parsed for the optional auth database (viamongoc_uri_parse_database
) and query options (viamongoc_uri_parse_options
).A connection string which executes the "split using database delimiter" branch is:
which is split into:
This branch assumes the only
/
that is present in the connection string (after the schema delimiter) MUST be the optional auth database delimiter. To my knowledge, this appears to be a reasonable assumption: no spec test, legacy or unified, contains a/
in the connection string after the scheme delimiter... except forMONGODB-OIDC
spec tests.The
MONGODB-OIDC
authentication mechanism introduced aTOKEN_RESOURCE
authentication mechanism property. The value of this property is expected to be a URI, e.g.mongodb://foo
. This means that a slash may now be present in the query that is not the optional auth database delimiter:This means the connection string above is incorrectly split into:
which results in parse failure.
Solution
The
mongoc_uri_parse
function is refactored to instead use a "cursor" to scan through the connection string and find the point at which to split into the "userhosts" (formerlybefore_slash
) and "dbopts" (formerlyoptions
) sections. The new algorithm may be described by the following pseudocode:The actual code contains thorough comments explaining how each step is expected to modify the position of the cursor through the connection string.
An important detail not included in the pseudocode above is the handling of reserved characters. Per RFC 3986, only a certain set of characters are permitted within each component of the connection string. The exclusion of these reserved characters is not being correctly handled, leading to cases such as
mongodb://us?r:pa?s@localhost
being accepted as valid despite'?'
not being permitted within the userinfo component. This test case has been updated accordingly.The permitted characters in each component are summarized below:
The important takeaway is the absence of
gen-delims
in theuserinfo
,host
, andquery
components, wheregen-delims
is summarized as":/?#[]@"
. These characters are now added to the list of excluded characters in calls toscan_to_unichar
.Note
Total conformance to RFC 3986 is not within the scope of this PR. Forbidding important
gen-delims
characters which have not been explicitly permitted within a given component to minimize the potential for incorrect parse failure is the focus of this PR.Parsing
':'
in authMechanismPropertiesCDRIVER-5580 is originally concerned with ensuring
','
are correctly excluded from the value of key value pairs (kvp) inauthMechanismProperties
. However, this PR is instead more focused on correctly supporting':'
in kvps instead.The
authMechanismProperties
query option has a value which are a list of kvps. Kvps are delimited by','
first, and split into a key and a value by':'
second. For example, given:the resulting BSON document representing the mechanism properties should look as follows:
This requirement that kvps are delimited by
','
is what motivated CDRIVER-5580 to forbid','
in the value of theTOKEN_RESOURCE
property (it can never be parsed in the intended manner). If a URI requires a comma, it must be percent-encoded instead.However, the current implementation of
mongoc_uri_parse_auth_mechanism_properties
incorrectly scans for':'
first, then another scan for','
second, with exclusions for':'
(if an excluded character is found before the terminator, then the result is "none found"). This leads to the above being incorrectly parsed as:This PR fixes this behavior and adds regression tests (
test_uri_uri_in_options
) accordingly.