Commit c64fc79

jmikola and kevinAlbs authored
DRIVERS-1621: Prohibit null bytes in document field names and regex components (#1051)
* DRIVERS-1621: Prohibit null bytes in document field names and regex components
* Clarify "encoding BSON" for prose tests
* Fix RST syntax error
* Clarify type-specific rules for parseErrors
* Do not require drivers to validate parseErrors strings as regular JSON

Co-authored-by: Kevin Albertson <[email protected]>
1 parent 1ff5fac commit c64fc79

4 files changed: +98 -27 lines changed

source/bson-corpus/bson-corpus.rst

Lines changed: 72 additions & 24 deletions
@@ -9,8 +9,8 @@ BSON Corpus
 :Status: Approved
 :Type: Standards
 :Minimum Server Version: N/A
-:Last Modified: July 20, 2017
-:Version: 2.0
+:Last Modified: September 2, 2021
+:Version: 2.1
 
 .. contents::
 
@@ -140,7 +140,7 @@ additional assertions. For each case, keys include:
 JSON document. Because this is itself embedded as a *string* inside a JSON
 document, characters like quote and backslash are escaped. It may be
 present for deprecated types and is the Canonical Extended JSON
-representation of ``converted_bson`.
+representation of ``converted_bson``.
 
 * ``lossy`` (optional) -- boolean; present (and true) iff ``canonical_bson``
 can't be represented exactly with extended JSON (e.g. NaN with a payload).
@@ -167,15 +167,6 @@ be encoded to the ``bson_type`` under test. For each case, keys include:
 * ``string``: a text or numeric representation of an input that can't be
 parsed to a valid value of the given type.
 
-Drivers MUST parse the extended JSON input using a regular JSON parser
-(not an extended JSON one) and verify the input is parsed successfully.
-This serves to verify that the parse error test cases test extended
-JSON-specific error conditions and that they do not have,
-for example, unintended spelling errors.
-
-Drivers SHOULD parse the extended JSON input using the extended JSON parser
-and verify the parsing produces an extended JSON parse error.
-
 Extended JSON encoding, escaping and ordering
 ---------------------------------------------
 
@@ -314,21 +305,48 @@ manner.
 Testing parsing errors
 ----------------------
 
-The interpretation of ``parseErrors`` is type-specific. For example,
-helpers for creating Decimal128 values may parse strings to convert them
-to binary Decimal128 values. The ``parseErrors`` cases are strings that
-will *not* convert correctly.
+The interpretation of ``parseErrors`` is type-specific. The structure of test
+cases within ``parseErrors`` is described in `Parse error case keys`_.
 
-The documentation for a type (if any) will specify how to use these
-cases for testing.
+Drivers SHOULD test that each case results in a parsing error (e.g. parsing
+Extended JSON, constructing a language type). Implementations MAY test
+assertions in an implementation-specific manner.
 
-For type "0x00" (i.e. top-level documents), the ``parseErrors`` entries have a
-``description`` field and an ``string`` field. Parsing the ``string`` field
-as Extended JSON MUST result in an error.
 
-Drivers SHOULD test that each case results in a parse error.
-Implementations MAY test assertions in an implementation-specific
-manner.
+Top-level Document (type 0x00)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For type "0x00" (i.e. top-level documents), the ``string`` field contains input
+for an Extended JSON parser. Drivers MUST parse the Extended JSON input using an
+Extended JSON parser and verify that doing so yields an Extended JSON parsing
+error.
+
+Drivers SHOULD also parse the Extended JSON input using a regular JSON parser (not
+an Extended JSON one) and verify the input is parsed successfully. This serves
+to verify that the ``parseErrors`` test cases are testing Extended JSON-specific
+error conditions and that they do not have, for example, unintended syntax
+errors.
+
+Note: due to the generic nature of these tests, they may also be used to test
+Extended JSON parsing errors for various BSON types appearing within a document.
+
+
+Binary (type 0x05)
+~~~~~~~~~~~~~~~~~~
+
+For type "0x05" (i.e. binary), the rules for handling ``parseErrors`` are the
+same as those for `Top-level Document (type 0x00)`_.
+
+
+Decimal128 (type 0x13)
+~~~~~~~~~~~~~~~~~~~~~~
+
+For type "0x13" (i.e. Decimal128), the ``string`` field contains input for a
+Decimal128 parser that converts string input to a binary Decimal128 value (e.g.
+Decimal128 constructor). Drivers MUST assert that these strings cannot be
+successfully converted to a binary Decimal128 value and that parsing the string
+produces an error.
+
 
 Deprecated types
 ----------------
@@ -338,6 +356,29 @@ Implementations MAY ignore or modify them to match legacy treatment of
 deprecated types. The ``converted_bson`` and ``converted_extjson`` fields MAY
 be used to test conversion to a standard type or MAY be ignored.
 
+Prose Tests
+===========
+
+The following tests have not yet been automated, but MUST still be tested.
+
+1. Prohibit null bytes in null-terminated strings when encoding BSON
+--------------------------------------------------------------------
+
+The BSON spec uses null-terminated strings to represent document field names and
+regex components (i.e. pattern and flags/options). Drivers MUST assert that null
+bytes are prohibited in the following contexts when encoding BSON (i.e. creating
+raw BSON bytes or constructing BSON-specific type classes):
+
+* Field name within a root document
+* Field name within a sub-document
+* Pattern for a regular expression
+* Flags/options for a regular expression
+
+Depending on how drivers implement BSON encoding, they MAY expect an error when
+constructing a type class (e.g. BSON Document or Regex class) or when encoding a
+language representation to BSON (e.g. converting a dictionary, which might allow
+null bytes in its keys, to raw BSON bytes).
+
 Implementation Notes
 ====================
 
@@ -456,6 +497,13 @@ assertions. This makes for easier and safer test case development.
 Changes
 =======
 
+Version 2.1 - September 2, 2021
+
+* Add spec and prose tests for prohibiting null bytes in null-terminated strings
+within document field names and regular expressions.
+
+* Clarify type-specific rules for ``parseErrors``.
+
 Version 2.0 - May 26, 2017
 
 * Revised to be consistent with Extended JSON spec 2.0: valid case fields
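
Note on the prose test added in this file: the rule "prohibit null bytes in null-terminated strings when encoding BSON" can be sketched as a single check in the cstring encoder. The following is a minimal illustration in Python, not any driver's actual implementation; `encode_cstring`, `encode_document`, and `encode_regex_element` are hypothetical names, and only int32 values and nested dicts are handled.

```python
import struct


def encode_cstring(value: str) -> bytes:
    """Encode a BSON cstring: UTF-8 bytes followed by a 0x00 terminator.

    Hypothetical helper. An embedded null byte would end the string early
    on the wire, so it is rejected before anything is written.
    """
    if "\x00" in value:
        raise ValueError(f"BSON cstring must not contain a null byte: {value!r}")
    return value.encode("utf-8") + b"\x00"


def encode_document(doc: dict) -> bytes:
    """Encode a dict of int32 values and nested dicts (sketch only)."""
    body = b""
    for key, value in doc.items():
        if isinstance(value, dict):
            # 0x03 = embedded document; field names inside it are checked too.
            body += b"\x03" + encode_cstring(key) + encode_document(value)
        elif isinstance(value, int):
            # 0x10 = int32, little-endian.
            body += b"\x10" + encode_cstring(key) + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported value type in this sketch: {type(value)!r}")
    # Total length = 4-byte length prefix + elements + 1-byte terminator.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


def encode_regex_element(key: str, pattern: str, flags: str) -> bytes:
    """Encode one regex element (0x0B): key, pattern and flags are all cstrings."""
    return (
        b"\x0b"
        + encode_cstring(key)
        + encode_cstring(pattern)
        + encode_cstring(flags)
    )
```

Because every field name and regex component passes through the same helper, the four contexts listed in the prose test (root field name, sub-document field name, regex pattern, regex flags) are covered by one check. A driver that validates at type-construction time instead would place the equivalent check in its Document or Regex constructor.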

source/bson-corpus/tests/document.json

Lines changed: 4 additions & 0 deletions
@@ -51,6 +51,10 @@
 {
 "description": "Invalid subdocument: bad string length in field",
 "bson": "1C00000003666F6F001200000002626172000500000062617A000000"
+},
+{
+"description": "Null byte in sub-document key",
+"bson": "150000000378000D00000010610000010000000000"
 }
 ]
 }
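
Note on the new decode error case: it may help to see why a strict decoder rejects these bytes. The sketch below is a deliberately small decoder, assuming only int32 (0x10) and embedded document (0x03) elements; `read_cstring` and `decode_document` are hypothetical names, not a driver API. In the new case the byte meant to be part of the key is read as the cstring terminator, so every following field is shifted and a stray byte is left before the end of the sub-document.

```python
import struct


def read_cstring(buf: bytes, pos: int, end: int):
    """Return (text, next_pos) for the cstring starting at pos."""
    try:
        nul = buf.index(b"\x00", pos, end)
    except ValueError:
        raise ValueError("unterminated cstring") from None
    return buf[pos:nul].decode("utf-8"), nul + 1


def decode_document(buf: bytes) -> dict:
    """Strictly decode a buffer holding exactly one BSON document (sketch)."""
    if len(buf) < 5:
        raise ValueError("document too short")
    (length,) = struct.unpack_from("<i", buf, 0)
    if length != len(buf):
        raise ValueError("declared length does not match buffer size")
    if buf[-1] != 0:
        raise ValueError("missing document terminator")

    out = {}
    pos, end = 4, length - 1  # elements live between the prefix and terminator
    while pos < end:
        type_byte = buf[pos]
        if type_byte not in (0x10, 0x03):
            raise ValueError(f"invalid element type 0x{type_byte:02x} at offset {pos}")
        key, pos = read_cstring(buf, pos + 1, end)
        if pos + 4 > end:
            raise ValueError("truncated element value")
        if type_byte == 0x10:
            (out[key],) = struct.unpack_from("<i", buf, pos)
            pos += 4
        else:
            (sub_len,) = struct.unpack_from("<i", buf, pos)
            if sub_len < 5 or pos + sub_len > end:
                raise ValueError("bad sub-document length")
            out[key] = decode_document(buf[pos:pos + sub_len])
            pos += sub_len
    return out
```

Passing `bytes.fromhex("150000000378000D00000010610000010000000000")` to `decode_document` raises `ValueError` in this sketch, because the byte left over after the shifted int32 is not a valid element type.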

source/bson-corpus/tests/regex.json

Lines changed: 2 additions & 2 deletions
@@ -54,11 +54,11 @@
 ],
 "decodeErrors": [
 {
-"description": "embedded null in pattern",
+"description": "Null byte in pattern string",
 "bson": "0F0000000B610061006300696D0000"
 },
 {
-"description": "embedded null in flags",
+"description": "Null byte in flags string",
 "bson": "100000000B61006162630069006D0000"
 }
 ]
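
Note on the two renamed regex cases: a reader can confirm what they exercise by splitting the bytes the way a decoder would. The sketch below assumes a document holding a single regex element (type 0x0B); `split_regex_element` is a hypothetical helper. It reads the key, pattern, and flags as cstrings and returns whatever bytes remain before the document terminator. Any leftover means the intended pattern or flags contained a null byte that ended the string early.

```python
def split_regex_element(bson_hex: str):
    """Return ([key, pattern, flags], leftover) for a one-regex BSON document."""
    buf = bytes.fromhex(bson_hex)
    if buf[4] != 0x0B:
        raise ValueError("first element is not a regular expression (0x0B)")
    pos = 5
    fields = []
    for _ in range(3):  # key, pattern, flags are each null-terminated
        nul = buf.index(b"\x00", pos)
        fields.append(buf[pos:nul])
        pos = nul + 1
    # Bytes between the flags terminator and the final document terminator.
    leftover = buf[pos:-1]
    return fields, leftover


# "Null byte in pattern string": the pattern is cut to "a", the flags become "c".
print(split_regex_element("0F0000000B610061006300696D0000"))
# "Null byte in flags string": the flags are cut to "i".
print(split_regex_element("100000000B61006162630069006D0000"))
```

For a well-formed element the leftover is empty. A strict decoder would treat the leftover bytes as the start of another element and fail on them, which is the decode error these cases expect.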

source/bson-corpus/tests/top.json

Lines changed: 20 additions & 1 deletion
@@ -79,6 +79,10 @@
 {
 "description": "Document truncated mid-key",
 "bson": "1200000002666F"
+},
+{
+"description": "Null byte in document key",
+"bson": "0D000000107800000100000000"
 }
 ],
 "parseErrors": [
@@ -241,7 +245,22 @@
 {
 "description": "Bad DBpointer (extra field)",
 "string": "{\"a\": {\"$dbPointer\": {\"a\": {\"$numberInt\": \"1\"}, \"$id\": {\"$oid\": \"56e1fc72e0c917e9c4714161\"}, \"c\": {\"$numberInt\": \"2\"}, \"$ref\": \"b\"}}}"
+},
+{
+"description" : "Null byte in document key",
+"string" : "{\"a\\u0000\": 1 }"
+},
+{
+"description" : "Null byte in sub-document key",
+"string" : "{\"a\" : {\"b\\u0000\": 1 }}"
+},
+{
+"description": "Null byte in $regularExpression pattern",
+"string": "{\"a\" : {\"$regularExpression\" : { \"pattern\": \"b\\u0000\", \"options\" : \"i\"}}}"
+},
+{
+"description": "Null byte in $regularExpression options",
+"string": "{\"a\" : {\"$regularExpression\" : { \"pattern\": \"b\", \"options\" : \"i\\u0000\"}}}"
 }
-
 ]
 }
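
Note on the four new `parseErrors` strings: per the revised spec text, they should parse cleanly with a regular JSON parser and fail only under Extended JSON rules. The sketch below uses Python's standard `json` module for the first half. `find_null_byte_violations` is a hypothetical stand-in for the check an Extended JSON parser would apply; it is not part of any driver.

```python
import json

# The "string" values of the four new cases, after unescaping the test file.
PARSE_ERROR_STRINGS = [
    r'{"a\u0000": 1 }',
    r'{"a" : {"b\u0000": 1 }}',
    r'{"a" : {"$regularExpression" : { "pattern": "b\u0000", "options" : "i"}}}',
    r'{"a" : {"$regularExpression" : { "pattern": "b", "options" : "i\u0000"}}}',
]


def find_null_byte_violations(value, path="$"):
    """List locations where a key or regex component contains a null byte."""
    problems = []
    if isinstance(value, dict):
        for key, child in value.items():
            if "\x00" in key:
                problems.append(f"{path}: key {key!r}")
            if key == "$regularExpression" and isinstance(child, dict):
                for part in ("pattern", "options"):
                    if "\x00" in str(child.get(part, "")):
                        problems.append(f"{path}.$regularExpression.{part}")
            problems.extend(find_null_byte_violations(child, f"{path}.{key}"))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            problems.extend(find_null_byte_violations(item, f"{path}[{index}]"))
    return problems


for text in PARSE_ERROR_STRINGS:
    parsed = json.loads(text)  # regular JSON parsing is expected to succeed
    print(find_null_byte_violations(parsed))
```

Each string yields at least one violation here, while an ordinary document such as `{"a": 1}` yields none. A real Extended JSON parser would raise a parse error at the equivalent point rather than return a list.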
