-
Notifications
You must be signed in to change notification settings - Fork 4
PHPLIB-1339 Add tests on Bitwise Query Operators #37
Conversation
- $match: | ||
a: | ||
$bitsAllClear: | ||
- { $binary: { base64: 'TUM9PQ==', subType: '00' } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've no idea what I'm doing with this binary data. This is the Extended JSON representation of the PHP new Binary('MC==')
, which I want to be the equivalent of the JS BinData(0, "ID==")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Binary::__construct()
takes a string argument representing binary data. So if you're constructing this in PHP, you should be doing new Binary(base64_decode(...))
.
I have no idea why the docs use base64 ID==
as the representation of binary 00100000
. Even the shell seems to disagree:
> BinData(0, "ID==")
Binary.createFromBase64("IA==", 0)
Using a tool like aaronluna.dev/base64, you can see that ID==
has the following representation:
ASCII
HEX 2 0
8-BIT 00100000
6-BIT 001000 000011
DECIMAL 8 3
BASE64 I D = =
IA==
is:
ASCII
HEX 2 0
8-BIT 00100000
6-BIT 001000 000000
DECIMAL 8 0
BASE64 I D = =
That suggests ID==
is non-canonical, since it denotes 12 bits of data. Only by truncating the last four bits do you end up with the same value as IA==
. I think this is worth a DOCSP ticket to suggest a change to the server docs.
libbson actually raises an error from _bson_json_parse_binary_elem()
when parsing ID==
(perhaps rightfully so):
Invalid input string "ID==", looking for base64-encoded binary
I'd suggest switching over to IA==
in this example and leaving a comment in the YAML explaining that you modified the value from the docs to use the more canonical representation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'm fixing the BSON binary type in #34 using a Yaml tag.
The tag !binary 'IA=='
will be converted to new MongoDB\BSON\Binary(base64_decode('IA=='))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See DOCSP-35733
- $match: | ||
a: | ||
$bitsAllClear: | ||
- { $binary: { base64: 'TUM9PQ==', subType: '00' } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Binary::__construct()
takes a string argument representing binary data. So if you're constructing this in PHP, you should be doing new Binary(base64_decode(...))
.
I have no idea why the docs use base64 ID==
as the representation of binary 00100000
. Even the shell seems to disagree:
> BinData(0, "ID==")
Binary.createFromBase64("IA==", 0)
Using a tool like aaronluna.dev/base64, you can see that ID==
has the following representation:
ASCII
HEX 2 0
8-BIT 00100000
6-BIT 001000 000011
DECIMAL 8 3
BASE64 I D = =
IA==
is:
ASCII
HEX 2 0
8-BIT 00100000
6-BIT 001000 000000
DECIMAL 8 0
BASE64 I D = =
That suggests ID==
is non-canonical, since it denotes 12 bits of data. Only by truncating the last four bits do you end up with the same value as IA==
. I think this is worth a DOCSP ticket to suggest a change to the server docs.
libbson actually raises an error from _bson_json_parse_binary_elem()
when parsing ID==
(perhaps rightfully so):
Invalid input string "ID==", looking for base64-encoded binary
I'd suggest switching over to IA==
in this example and leaving a comment in the YAML explaining that you modified the value from the docs to use the more canonical representation.
- $match: | ||
a: | ||
$bitsAllSet: | ||
- { $binary: { base64: 'TUM9PQ==', subType: '00' } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://www.mongodb.com/docs/manual/reference/operator/query/bitsAllSet/#bindata-bitmask uses MC==
, which also isn't canonical:
> BinData(0, "MC==")
Binary.createFromBase64("MA==", 0)
I'd suggest including this in your DOCSP ticket as well. I've no idea where the author(s) got these values from.
- $match: | ||
a: | ||
$bitsAnyClear: | ||
- { $binary: { base64: 'TUM9PQ==', subType: '00' } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue:
> BinData(0, "MC==")
Binary.createFromBase64("MA==", 0)
- $match: | ||
a: | ||
$bitsAnySet: | ||
- { $binary: { base64: 'TUM9PQ==', subType: '00' } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue:
> BinData(0, "MC==")
Binary.createFromBase64("MA==", 0)
- $match: | ||
a: | ||
$bitsAllClear: | ||
- !binary 'IA==' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL about YAML binary strings. Since this is a single line and we're not dealing with whitespace, I think it'd be reasonable to use the canonical !!binary
notation.
But I'm also curious if a binary string is even appropriate here. The corresponding example uses a BSON binary type, so shouldn't you be using extended JSON notation here? I think that was in an earlier draft of the PR.
Same question applies to other files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I rename the !binary
tag to !bson_binary
, and the same for other tags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these tags are custom on your part (wasn't clear to me if you were re-using existing YAML syntax), I think that'd be a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #43
- $match: | ||
a: | ||
# $bitsAllClear: 35 | ||
$bitsAllClear: [35] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of the array? The original Integer Example uses 35
, which seems to be what you have in the comment above.
Same question applies to other files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment.
Example uses the short form, the builder always generates the array form
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies if I'm being obtuse, but $bitsAllClear: [35]
seems entirely different than $bitsAllClear: 35
. The former would be checking for the 35th bit and the latter checks for the bits according to the binary representation of int(35)
. These are functionally different APIs (for all of these operators).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for insisting, it's fixed.
- $match: | ||
a: | ||
$bitsAllSet: | ||
- !binary 'IA==' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: apart from bitsAllClear
, all of the other examples in the server docs used "MC==", which should be "MA==".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not very relevant for what we are testing, but I fixed. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aye, I don't imagine it affects the test, but thought it may be helpful to keep things in sync with the doc examples.
tests/Builder/Query/Pipelines.php
Outdated
"a": { | ||
"$bitsAllClear": [ | ||
{ | ||
"$binary": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the !binary
syntax in YAML is producing extended JSON for a BSON binary type here. Where does that conversion happen? I'd have expected the YAML to just produce a binary string, which would still be parsed as a string type.
Do any of the YAML templates include extended JSON?
Edit: I just saw https://github.com/mongodb/mongo-php-builder/pull/34/files#diff-a715bdcf88cd771d7ae09bd01863bab314ff194ba28e984689e9040d87846db9R20 and https://github.com/mongodb/mongo-php-builder/pull/34/files#diff-41189b0703bfb03f394530009a7d9c88c508f36a39585abd9f73017d6f77745eR135 from PHPLIB-1350 -- so this appears intentional.
- $match: | ||
a: | ||
# $bitsAllClear: 35 | ||
$bitsAllClear: [35] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies if I'm being obtuse, but $bitsAllClear: [35]
seems entirely different than $bitsAllClear: 35
. The former would be checking for the 35th bit and the latter checks for the bits according to the binary representation of int(35)
. These are functionally different APIs (for all of these operators).
*/ | ||
public function __construct(Binary|PackedArray|BSONArray|array|int|string $bitmask) | ||
public function __construct(Binary|int|string ...$bitmask) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re:
- https://github.com/mongodb/mongo-php-builder/pull/37/files#r1459574672
- https://github.com/mongodb/mongo-php-builder/pull/37/files#r1459323536
I think the problem I'm pointing out in the above threads is due to ambiguity when a single integer is passed here. If 5
is passed to this method, it is interpreted as [5]
, which is not the same.
I think this needs to be reworked to accept either:
- An array of ints
- A single int
- A single BSON Binary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, there are 3 distinct behaviors. I'll remove the variadic and change the type to int|binary|list<int>
.
Fix PHPLIB-1339
https://www.mongodb.com/docs/manual/reference/operator/query/#bitwise
$bitsAllClear
$bitsAllSet
$bitsAnyClear
$bitsAnySet
I made all this operators variadics, as suggested previously for
$type
.