feat!(NODE-4410): only enumerate own properties #527
)[n] = options[n as keyof DeserializeOptions];
}
arrayOptions['raw'] = true;
arrayOptions = { ...options, raw: true };
Is this a change we want to make?
iiuc, this line of code copies the options before recursing (presumably so we're not mutating a shared options object in recursive deserialize calls). The for-in loop here is only used to copy the options object, not the document being deserialized.
My understanding of the intention behind this change is that when serializing objects, we only want to serialize own properties on the object so we don't pull in keys from the prototype of the object. This seems unrelated?
The issue is related because it has similarly surprising side effects: keys that you didn't specify yourself can affect the outcome of your deserialize calls. I tried demonstrating the possible buggy behavior in the test in test/node/parser/deserializer.test.ts. If the global object prototype has been polluted with unexpected keys, they can control the behavior of the BSON library without actually needing access to the options object a user passes in.
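To make the hazard concrete, here is a minimal sketch, independent of the BSON source, comparing a for-in copy with a spread copy when the prototype of the options object carries a stray key. The `Options` type and both helper names are hypothetical and exist only for illustration.

```typescript
// Hypothetical option bag; this is NOT the library's DeserializeOptions type.
type Options = { raw?: boolean; promoteValues?: boolean };

// Copy with for-in: visits inherited enumerable keys as well as own keys.
function copyWithForIn(options: Options): Options {
  const copy: Record<string, unknown> = {};
  for (const key in options) {
    copy[key] = options[key as keyof Options];
  }
  return copy as Options;
}

// Copy with spread: only own enumerable keys are copied.
function copyWithSpread(options: Options): Options {
  return { ...options };
}

const polluted: Options = { raw: true };
Object.setPrototypeOf(polluted, { promoteValues: false });

const viaForIn = copyWithForIn(polluted);
const viaSpread = copyWithSpread(polluted);

console.log(Object.keys(viaForIn));  // [ 'raw', 'promoteValues' ]
console.log(Object.keys(viaSpread)); // [ 'raw' ]
```

The for-in copy turns the inherited `promoteValues` into an own property of the copy, which is how a key nobody passed explicitly can end up steering a recursive call.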
I see. This impacts all other options too, no?
import { deserialize, serialize } from "./src/bson";
const bytes = serialize({ someKey: {
nested: "value"
} });
const opts = {}
Object.setPrototypeOf(opts, { raw: true });
const result = deserialize(bytes, opts);
console.log(result);
// { someKey: <Buffer 17 00 00 00 02 6e 65 73 74 65 64 00 06 00 00 00 76 61 6c 75 65 00 00> }
It seems like what we'd want to do is only look for own properties on the options objects too when accessing individual fields (not in scope for this work). My 2-cents would be to remove this from this PR and file a follow up to only consider options that are set as own properties (probably an easy fix by making a shallow clone of the options before we begin any work deserializing / serializing).
If you want to leave this in this PR, that's also okay because technically the scope of the ticket is "Use Object.keys to enumerate keys on a JS object passed to BSON", and this is an instance where we're using a for-in loop. If we leave this work in this PR, we should still file a follow up ticket I think. And I'll comment below, but I think we can make the test easier to understand.
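The shallow-clone follow-up suggested in this comment could look roughly like the sketch below. `ownOptions` is a hypothetical helper, not part of the BSON API. It copies own enumerable properties onto a null-prototype object, so later field reads cannot fall through to any prototype, including a polluted `Object.prototype`.

```typescript
// Hypothetical helper: keep only the own enumerable options.
// The null-prototype target means property lookups never reach a prototype.
function ownOptions<T extends object>(options: T): Partial<T> {
  return Object.assign(Object.create(null), options);
}

const opts: { raw?: boolean } = {};
Object.setPrototypeOf(opts, { raw: true });

console.log(opts.raw);             // true (inherited from the prototype)
console.log(ownOptions(opts).raw); // undefined
```

A plain `{ ...options }` clone would also drop a custom prototype like the one above, but its result still inherits from `Object.prototype`, so it would not help if the global prototype itself were polluted.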
Let's keep the fix here, but I agree about following up. And yea happy to improve the test, let me know
Works for me. I left a comment about the test below
test/node/parser/serializer.test.ts
Outdated
expect(Array.from(bytes)).to.include('a'.charCodeAt(0));
expect(Array.from(bytes)).to.not.include('b'.charCodeAt(0));
Could we instead assert equality on the exact serialized buffer (a BSON document with exactly one key, a, with a value of 1)? We have the buffer-from-hex-array helper, so we could easily do something like:
expect(bytes).to.deep.equal(bufferFromHexArray([
// relevant hex values here
]));
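For reference, the exact bytes such an assertion would compare against can be derived by hand from the BSON specification. The sketch below builds the document `{ a: 1 }` without the library. `encodeSingleInt32Doc` is a hypothetical helper written for illustration: it handles only a single int32 field with an ASCII key and is not how the library serializes.

```typescript
// Hypothetical, illustration-only encoder for a document with one int32 field.
function encodeSingleInt32Doc(key: string, value: number): Uint8Array {
  // 4 (size) + 1 (type) + key bytes + 1 (key terminator) + 4 (int32) + 1 (document terminator)
  const size = 4 + 1 + key.length + 1 + 4 + 1;
  const bytes = new Uint8Array(size);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setInt32(offset, size, true); // total document size, little-endian
  offset += 4;
  bytes[offset++] = 0x10;            // element type: int32
  for (let i = 0; i < key.length; i++) {
    bytes[offset++] = key.charCodeAt(i); // ASCII key bytes
  }
  bytes[offset++] = 0x00;            // key terminator
  view.setInt32(offset, value, true); // value, little-endian
  offset += 4;
  bytes[offset] = 0x00;              // document terminator
  return bytes;
}

const hex = Array.from(encodeSingleInt32Doc('a', 1), b => (b < 16 ? '0' : '') + b.toString(16)).join(' ');
console.log(hex); // 0c 00 00 00 10 61 00 01 00 00 00 00
```

An exact-bytes comparison also rules out a stray `b` key more strictly than checking that the byte `0x62` is absent, since that byte could legitimately appear inside a length or value.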
test/node/extended_json.test.ts
Outdated
it('should only enumerate own property keys from input objects', () => {
const input = { a: 1 };
Object.setPrototypeOf(input, { b: 2 });
const string = EJSON.stringify(input);
expect(string).to.include(`"a":`);
expect(string).to.not.include(`"b":`);
same comment here as in serializer.test.ts
This is a little different. I can hardcode the JSON as is done in other tests here, but an issue I've seen crop up a number of times when working on EJSON is that the spacing changes, which doesn't actually mean the format changed. Currently this test isn't space sensitive. Ideally I'd like us to add a JSON equality checker that is agnostic about that kind of thing; we'll likely need one when we fix the EJSON corpus issues.
So stringified ejson is valid json right? The concern is circularly testing our bson library by using ejson.parse to test ejson.stringify and vice-versa. But could we use JSON.parse?
const input = { a: 1 };
Object.setPrototypeOf(input, { b: 2 });
const string = EJSON.stringify(input);
const parsed = JSON.parse(string);
expect(parsed).to.deep.equal({ a: 1 });
Yea, for this test that's fine, I'll update. But generally JSON.parse is lossy in certain scenarios that do matter for EJSON/BSON correctness, so it's not always viable.
interesting, do you have an example?
I kept the assertion that checks the `"b":` string doesn't exist. Since that's what we really care about, it seems worth keeping, and of course JSON.parse + deep equal confirms it.
Numbers that exceed the precision supported by javascript
JSON.parse(`{"a":9223372036854775807}`) // { a: 9223372036854776000 }
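A quick way to check that claim: the value in the example is the largest signed 64-bit integer, which is far beyond what a JavaScript number can represent exactly. The sketch below uses only built-in APIs.

```typescript
// 9223372036854775807 is 2^63 - 1; JSON.parse rounds it to the nearest double.
const parsed = JSON.parse('{"a":9223372036854775807}') as { a: number };

console.log(Number.isSafeInteger(parsed.a));  // false
console.log(String(parsed.a));                // 9223372036854776000
console.log(parsed.a === Math.pow(2, 63));    // true: the value was rounded up by one
```

So a JSON.parse round trip is fine for small test values like `{ a: 1 }`, but it cannot be used to verify 64-bit integer output.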
const bytes = BSON.serialize({ someKey: [1] });
const options = { fieldsAsRaw: { someKey: true } };
Object.setPrototypeOf(options, { promoteValues: false });
const result = BSON.deserialize(bytes, options);
expect(result).to.have.property('someKey').that.is.an('array');
expect(result.someKey[0]).to.not.have.property('_bsontype', 'Int32');
Related to the comment above: this test is kind of confusing. It's not immediately apparent why fieldsAsRaw is necessary when looking at the test. I had to read the code to realize that we only use the for-in loop when fieldsAsRaw is set to true, so we need that option to trigger the offending code.
context('when the fieldsAsRaw options is present and has a value that corresponds to a key in the object', () => {
it('ignores non-own properties set on the options object', () => {
const bytes = BSON.serialize({ someKey: [1] });
const options = { fieldsAsRaw: { someKey: true } };
Object.setPrototypeOf(options, { promoteValues: false });
const result = BSON.deserialize(bytes, options);
expect(result).to.have.property('someKey').that.is.an('array');
expect(result.someKey[0], 'expected promoteValues option set on options object prototype to be ignored, but it was not').to.not.be.instanceOf(Int32);
});
});
docs/upgrade-to-v5.md
Outdated
### BSON Element names are now fetched only from object's own properties

Previously objects passed to the `BSON.serialize`, `BSON.calculateObjectSize`, and `EJSON.stringify` API would have the element names enumerated with a `for-in` loop which will emit keys defined on the prototype. Since this is likely surprising, especially if a globally shared prototype has been modified we are now using `Object.keys` to enumerate the element names from a js object.
`BSON.serialize`, `EJSON.stringify` and `BSON.calculateObjectSize` now only consider own properties and do not consider properties defined on the prototype of the object when serializing objects.
Example:
```typescript
const object = { a: 1 };
Object.setPrototypeOf(object, { b: 2 });
deserialize(serialize(object)); // { a: 1 } in 5.0, { a: 1, b: 2 } in 4.x
```
updated with just a bit of rephrasing
one small upgrade guide suggestion
@nbbeeken Merge conflicts :/
Description
What is changing?
Using a key enumeration method that does not include properties defined on object prototypes.
Is there new documentation needed for these changes?
What is the motivation for this change?
Keys defined on prototypes are not always visible or obviously inherited. In an effort to make it clear what a JS object will serialize to, we now use APIs that only enumerate "own" properties.
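As a small illustration of the difference between the two enumeration styles, the sketch below uses only built-in JavaScript behavior and no BSON API. The variable names are made up for the example.

```typescript
const input: Record<string, number> = { a: 1 };
Object.setPrototypeOf(input, { b: 2 });

// for-in walks the prototype chain and reports inherited enumerable keys.
const forInKeys: string[] = [];
for (const key in input) {
  forInKeys.push(key);
}

console.log(forInKeys);             // [ 'a', 'b' ]
console.log(Object.keys(input));    // [ 'a' ]
console.log(JSON.stringify(input)); // {"a":1}
```

`JSON.stringify` already ignores inherited keys, so enumerating with `Object.keys` brings serialization in line with what most users expect from the built-in.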
Double check the following
npm run lint script
<type>(NODE-xxxx)<!>: <description>