Skip to content

PHPC-459: BSON objects should implement JsonSerializable #454

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 16, 2016

Conversation

jmikola
Copy link
Member

@jmikola jmikola commented Nov 7, 2016

@jmikola jmikola force-pushed the phpc-459 branch 2 times, most recently from 09ed633 to d82ab95 Compare November 7, 2016 21:04
@jmikola jmikola changed the title [WIP] PHPC-459: BSON objects should implement JsonSerializable PHPC-459: BSON objects should implement JsonSerializable Nov 7, 2016

has_scope = (intern->scope && intern->scope->len);
array_init_size(return_value, has_scope ? 2 : 1);
ADD_ASSOC_STRINGL(return_value, "$code", intern->code, intern->code_len);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is consistent with the current Extended JSON spec. As noted in the tests, libbson currently converts Javascript types to JSON by returning its code field as a string value (see: CDRIVER-1335). It also does not properly parse $code and $scope fields when converting JSON to BSON, which causes the JSON we produce here to be read by libbson as a normal embedded document.

The HHVM implementation currently mimics libbson (see: HHVM-156); however, I think we should change its behavior to comply with the spec rather than libbson, which should itself be fixed in 1.6.0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened HHVM-266 to track this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me, I'll make a PR for HHVM-266 soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what we should do here. In my opinion, we need to pick something now, and stick with it. Change the format would be an API change, and I'd be reluctant to change that before a 2.0 release.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting on a response from the C driver team. I imagine they may keep their non-compliant output format (i.e. code as a string and ignoring scope) for BC until 2.0 but allow the $code/$scope format to be parsed properly when they address the ticket in 1.6.0.

I'm all in favor of sticking to the spec and returning a compliant value today, even if libbson cannot parse it yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll note that libbson cannot currently roundtrip Javascript types anyway (they'd be parsed back as a BSON string), so there is no risk by sticking with the proper $code/$scope format.


array_init_size(return_value, 2);
ADD_ASSOC_STRINGL(return_value, "$regex", intern->pattern, intern->pattern_len);
ADD_ASSOC_STRINGL(return_value, "$options", intern->flags, intern->flags_len);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment in DRIVERS-331 asking whether we should alphabetically order the regex flag characters (as should be done for raw BSON to support binary comparisons). The implementation here leaves the flags as-is.

If we do start alphabetically ordering the characters in the future, we'll likely do so when the Regex object is initialized and this code shouldn't need to change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The relevant PHPC ticket for flag ordering is PHPC-829.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And HHVM is at HHVM-264.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should order the flags in the constructor/initialiser only. That should be all what we need to do about this I believe.

The HHVM ticket is HHVM-264.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should order the flags in the constructor/initialiser only

I concur. Nothing to do here and we have our respective tickets.


array_init_size(&ts, 2);
ADD_ASSOC_LONG_EX(&ts, "t", intern->timestamp);
ADD_ASSOC_LONG_EX(&ts, "i", intern->increment);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Extended JSON spec currently states that a Timestamp should yield:

{"$timestamp": {"t": integer, "i": integer}}

We're still discussing whether these values should be single strings or perhaps {"$numberLong": string} objects. We cannot guarantee portability with integers, since the values may fall outside the range of 32-bit signed integers (both values are unsigned 32-bit integers).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened PHPC-828 to track this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think HHVM does the same format too:

        return [
            '$timestamp' => [
                't' => (int) $this->timestamp,
                'i' => (int) $this->increment,
            ]
        ];

So nothing we should change there - right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like with JavaScript, we need to make a decision now. We won't be able to change this until 2.0 otherwise due to BC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will follow up with @llvtt in chat later and ask for this to be address in the spec ASAP.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After speaking with @llvtt, it's clear that the spec is not far enough along to make an assumption about the Timestamp format.

@derickr: Would you rather we throw an exception for out-of-range values on 32-bit systems or instead allow the unsigned values to be printed in their signed representation? I can re-purpose PHPC-828 if necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think an exception is great here, as that means that some random data can suddenly abort a script. I think for now, showing their signed representation is "good enough". We can fix that, if the ext-json spec has been finalised.


array_init_size(&udt, 1);
ADD_ASSOC_STRINGL(&udt, "$numberLong", s_milliseconds, s_milliseconds_len);
ADD_ASSOC_ZVAL_EX(return_value, "$date", &udt);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Extended JSON spec currently states that a UTCDateTime should yield:

{"$date": {"$numberLong": "<ms since epoch as an integer>"}}

Alternative, optional format:

{"$date": {"$isoDate": <ISO-8601 date string in UTC as a string>}}

I'm opting to use the first format as ISO-8601 has issues representing the full range of dates supported by the BSON date type (as discussed during the previous driver team meeting). Additionally, the $numberLong format is much easier to generate.

HHVM currently yields the $isoDate format, which I think we should change. Discussion is ongoing in the spec about whether drivers should always yield $numberLong and optionally provide a $isoDate purely for user readability purposes (i.e. it will never be parsed for the authoritative value).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened HHVM-267 to track this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree that we need to change this now, as we had the exact same discussion in the HHVM PR, where we decided to do it that way.

The team meeting notes only say:

Our JsonSerializable output should be in a format that libbson and read and convert back into BSON. This means that we may not be able to use formats in the Extended JSON spec just yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that libbson currently supports an ISO 8601 date as a string value for $date (BSON_JSON_LF_DATE) and a 64-bit integer timestamp as a $numberLong object within the $date value (BSON_JSON_IN_BSON_TYPE_DATE_NUMBERLONG). It does not currently support the $isoDate nesting object for $date.

Since $numberLong is already supported, superior to the ISO string (in the values it can represent), and has a high chance of ending up in the Extended JSON spec, I think that's worth implementing.

Will follow up with @llvtt in chat later and ask for this to be address in the spec ASAP.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After speaking to @llvtt, it's likely that BSON dates will rely on $numberLong representation. $isoDate will likely end up as an optional field allowed to co-exist alongside $numberLong in some fashion, but it will be purely informational for the benefit of human readability.

I will submit a PR to modify HHVM's behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@@ -0,0 +1,19 @@
--TEST--
MongoDB\BSON\Binary::jsonSerialize() return value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you just copy all of my tests and added these ==DONE== things for all of them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These "return value" tests are new. The "json_encode()" tests are based on your own (with different code formatting and ===DONE=== blocks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@@ -401,6 +440,7 @@ PHP_MINIT_FUNCTION(Binary)

zend_class_implements(php_phongo_binary_ce TSRMLS_CC, 1, php_phongo_type_ce);
zend_class_implements(php_phongo_binary_ce TSRMLS_CC, 1, zend_ce_serializable);
zend_class_implements(php_phongo_binary_ce TSRMLS_CC, 1, php_json_serializable_ce);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pedant in me would like this ordered alphabetically ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

PHP_METHOD(Decimal128, jsonSerialize)
{
php_phongo_decimal128_t *intern;
char outbuf[BSON_DECIMAL128_STRING] = "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of initialising this with a empty string in the code segment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously done in php_phongo_decimal128_get_properties() as insurance that the string was null-terminated even if bson_decimal128_to_string() decided not to touch it. Based on my reading of bson_decimal128_to_string(), does always assign a null-terminated string to its output buffer, but it wasn't documented as doing so and I wanted to be defensive. The alternative was a memset() on the array, but I didn't think that was necessary.

__toString() and serialize() currently use char outbuf[BSON_DECIMAL128_STRING] without any assignment, but I wanted to update those later.


has_scope = (intern->scope && intern->scope->len);
array_init_size(return_value, has_scope ? 2 : 1);
ADD_ASSOC_STRINGL(return_value, "$code", intern->code, intern->code_len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what we should do here. In my opinion, we need to pick something now, and stick with it. Change the format would be an API change, and I'd be reluctant to change that before a 2.0 release.

intern = Z_JAVASCRIPT_OBJ_P(getThis());

has_scope = (intern->scope && intern->scope->len);
array_init_size(return_value, has_scope ? 2 : 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't bother with the trinary here. Just make it 2 - it's merely a hint to the underlying algorithm anyway. And in that case, you can get rid of the has_scope variable, and just do the test in the if() below.


array_init_size(return_value, 2);
ADD_ASSOC_STRINGL(return_value, "$regex", intern->pattern, intern->pattern_len);
ADD_ASSOC_STRINGL(return_value, "$options", intern->flags, intern->flags_len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should order the flags in the constructor/initialiser only. That should be all what we need to do about this I believe.

The HHVM ticket is HHVM-264.


array_init_size(&ts, 2);
ADD_ASSOC_LONG_EX(&ts, "t", intern->timestamp);
ADD_ASSOC_LONG_EX(&ts, "i", intern->increment);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like with JavaScript, we need to make a decision now. We won't be able to change this until 2.0 otherwise due to BC.


array_init_size(&udt, 1);
ADD_ASSOC_STRINGL(&udt, "$numberLong", s_milliseconds, s_milliseconds_len);
ADD_ASSOC_ZVAL_EX(return_value, "$date", &udt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree that we need to change this now, as we had the exact same discussion in the HHVM PR, where we decided to do it that way.

The team meeting notes only say:

Our JsonSerializable output should be in a format that libbson and read and convert back into BSON. This means that we may not be able to use formats in the Extended JSON spec just yet.

Copy link
Contributor

@derickr derickr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but PRs to "fix" HHVM would be lovely too ;-)


array_init_size(&ts, 2);
ADD_ASSOC_LONG_EX(&ts, "t", intern->timestamp);
ADD_ASSOC_LONG_EX(&ts, "i", intern->increment);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think an exception is great here, as that means that some random data can suddenly abort a script. I think for now, showing their signed representation is "good enough". We can fix that, if the ext-json spec has been finalised.


array_init_size(&udt, 1);
ADD_ASSOC_STRINGL(&udt, "$numberLong", s_milliseconds, s_milliseconds_len);
ADD_ASSOC_ZVAL_EX(return_value, "$date", &udt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@@ -0,0 +1,19 @@
--TEST--
MongoDB\BSON\Binary::jsonSerialize() return value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@jmikola jmikola merged commit 96f24c9 into mongodb:master Nov 16, 2016
jmikola added a commit that referenced this pull request Nov 16, 2016
@jmikola jmikola deleted the phpc-459 branch November 16, 2016 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants