feat!(NODE-4410): only enumerate own properties #527

Merged: baileympearson merged 7 commits into main from NODE-4410-no-proto-keys on Dec 5, 2022

Conversation

Contributor

@nbbeeken nbbeeken commented Nov 22, 2022

Description

What is changing?

Using a key enumeration method that does not include properties defined on object prototypes.

Is there new documentation needed for these changes?

What is the motivation for this change?

Keys defined on prototypes are not always visible or obviously inherited. To make it clear what a JS object will serialize to, we now use APIs that only enumerate "own" properties.
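As a minimal sketch of the difference this description refers to, the snippet below contrasts a `for-in` loop with `Object.keys` on an object that inherits one key. The names `proto` and `doc` are hypothetical and exist only for illustration; nothing here calls the BSON library.

```typescript
// Hypothetical objects for illustration only.
const proto = { inherited: 2 };
const doc: Record<string, number> = Object.create(proto);
doc.own = 1;

// for-in walks the prototype chain and reports enumerable inherited keys.
const forInKeys: string[] = [];
for (const key in doc) {
  forInKeys.push(key);
}

// Object.keys reports only the object's own enumerable keys.
const ownKeys = Object.keys(doc);

console.log(forInKeys); // [ 'own', 'inherited' ]
console.log(ownKeys); // [ 'own' ]
```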

Double check the following

  • Ran npm run lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: <type>(NODE-xxxx)<!>: <description>
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

@nbbeeken nbbeeken force-pushed the NODE-4410-no-proto-keys branch from f06d897 to 57b02e9 Compare November 28, 2022 19:46
@nbbeeken nbbeeken marked this pull request as ready for review November 28, 2022 19:47
@nbbeeken nbbeeken force-pushed the NODE-4410-no-proto-keys branch from 57b02e9 to 06ed7c7 Compare November 28, 2022 20:30
@nbbeeken nbbeeken added the Primary Review In Review with primary reviewer, not yet ready for team's eyes label Nov 30, 2022
)[n] = options[n as keyof DeserializeOptions];
}
arrayOptions['raw'] = true;
arrayOptions = { ...options, raw: true };
Contributor

Is this a change we want to make?

iiuc, this line of code copies the options before recursing (presumably so we're not mutating a shared options object in recursive deserialize calls). The for-in loop here is only used to copy the options object, not the document being deserialized.

My understanding of the intention behind this change is that when serializing objects, we only want to serialize own properties on the object so we don't pull in keys from the prototype of the object. This seems unrelated?

Contributor Author

The issue is related because it has similarly surprising side effects: keys that you didn't specify yourself can affect the outcome of your deserialize calls. I tried demonstrating the possible buggy behavior in the test in test/node/parser/deserializer.test.ts. If the global object prototype has been polluted with unexpected keys, they can control the behavior of the BSON library without actually needing access to the options object a user passes in.
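A rough sketch of the effect described here, assuming a simplified option shape: copying an options object with `for-in` turns inherited keys into own keys of the copy, while object spread copies only own keys. `SketchOptions`, `copyWithForIn`, and `copyWithSpread` are hypothetical names, not the library's code.

```typescript
// Hypothetical option shape; the real DeserializeOptions has more fields.
type SketchOptions = {
  raw?: boolean;
  promoteValues?: boolean;
};

// Copy with for-in: inherited enumerable keys leak into the copy as own keys.
function copyWithForIn(options: SketchOptions): SketchOptions {
  const copy: Record<string, unknown> = {};
  for (const name in options) {
    copy[name] = (options as Record<string, unknown>)[name];
  }
  return copy as SketchOptions;
}

// Copy with object spread: only own enumerable keys are copied.
function copyWithSpread(options: SketchOptions): SketchOptions {
  return { ...options };
}

const userOptions: SketchOptions = { raw: true };
// Simulate a key the caller never set themselves.
Object.setPrototypeOf(userOptions, { promoteValues: false });

const leaked = copyWithForIn(userOptions);
const clean = copyWithSpread(userOptions);

console.log(Object.keys(leaked)); // [ 'raw', 'promoteValues' ]
console.log(Object.keys(clean)); // [ 'raw' ]
```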

Contributor

I see. This impacts all other options too, no?

import { deserialize, serialize } from "./src/bson";

const bytes = serialize({ someKey: { 
    nested: "value"
} });

const opts = {} 
Object.setPrototypeOf(opts, { raw: true });

const result = deserialize(bytes, opts);

console.log(result); 
// { someKey: <Buffer 17 00 00 00 02 6e 65 73 74 65 64 00 06 00 00 00 76 61 6c 75 65 00 00> }

It seems like what we'd want to do is only look for own properties on the options objects too when accessing individual fields (not in scope for this work). My 2-cents would be to remove this from this PR and file a follow up to only consider options that are set as own properties (probably an easy fix by making a shallow clone of the options before we begin any work deserializing / serializing).

If you want to leave this in this PR, that's also okay because technically the scope of the ticket is "Use Object.keys to enumerate keys on a JS object passed to BSON", and this is an instance where we're using a for-in loop. If we leave this work in this PR, we should still file a follow up ticket I think. And I'll comment below, but I think we can make the test easier to understand.
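One possible shape for the follow-up mentioned above, sketched under assumptions: clone the options onto a null-prototype object so that later property reads cannot fall through to any prototype. `cloneOwnOptions` is a hypothetical helper, not something the BSON library is known to provide.

```typescript
// Hypothetical helper, not part of the BSON library.
function cloneOwnOptions<T extends object>(options: T): Partial<T> {
  // Object.assign copies only own enumerable properties of `options`;
  // the null-prototype target inherits nothing.
  return Object.assign(Object.create(null), options);
}

const opts: { raw?: boolean } = {};
Object.setPrototypeOf(opts, { raw: true });

console.log(opts.raw); // true (read through the prototype chain)
console.log(cloneOwnOptions(opts).raw); // undefined
```

A plain `{ ...options }` clone would also drop keys inherited from a custom prototype, but the result still inherits from `Object.prototype`, so it would not guard against a polluted global prototype.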

Contributor Author

@nbbeeken nbbeeken Dec 1, 2022

Let's keep the fix here, but I agree about following up. And yea happy to improve the test, let me know

Contributor

Works for me. I left a comment about the test below

Comment on lines 10 to 11
expect(Array.from(bytes)).to.include('a'.charCodeAt(0));
expect(Array.from(bytes)).to.not.include('b'.charCodeAt(0));
Contributor

Could we instead assert equality on the exact serialized buffer (a bson document with exactly one key of a with value of 1)? We have the buffer from hex array helper, so we could easily do something like:

expect(bytes).to.deep.equal(bufferFromHexArray([
  // relevant hex values here
]));
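For reference, a sketch of what the exact bytes for a document with one key `a` and int32 value `1` would look like under the BSON layout (little-endian size, type byte `0x10`, NUL-terminated key, int32 value, trailing NUL). `encodeSingleInt32Doc` and `toHex` are hypothetical helpers written for this illustration; they assume an ASCII key and are not the library's serializer.

```typescript
// Hypothetical encoder for a document with a single int32 field and an
// ASCII key; the real serializer handles far more.
function encodeSingleInt32Doc(key: string, value: number): Uint8Array {
  // int32 size + type byte + key + NUL + int32 value + trailing NUL
  const size = 4 + 1 + key.length + 1 + 4 + 1;
  const bytes = new Uint8Array(size);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setInt32(offset, size, true); // little-endian total size
  offset += 4;
  bytes[offset++] = 0x10; // element type: int32
  for (let i = 0; i < key.length; i++) {
    bytes[offset++] = key.charCodeAt(i); // ASCII key bytes
  }
  bytes[offset++] = 0x00; // key terminator
  view.setInt32(offset, value, true); // little-endian int32 value
  offset += 4;
  bytes[offset] = 0x00; // document terminator
  return bytes;
}

// Render bytes as space-separated two-digit hex.
const toHex = (bytes: Uint8Array): string =>
  Array.from(bytes, (b) => (b < 16 ? '0' : '') + b.toString(16)).join(' ');

console.log(toHex(encodeSingleInt32Doc('a', 1)));
// 0c 00 00 00 10 61 00 01 00 00 00 00
```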

Comment on lines 795 to 800
it('should only enumerate own property keys from input objects', () => {
const input = { a: 1 };
Object.setPrototypeOf(input, { b: 2 });
const string = EJSON.stringify(input);
expect(string).to.include(`"a":`);
expect(string).to.not.include(`"b":`);
Contributor

same comment here as in serializer.test.ts

Contributor Author

This is a little different. I can hardcode the JSON as is done in other tests here, but an issue I've seen crop up a number of times when working on EJSON is that the spacing changes, which doesn't mean the format really changed. Currently this test isn't space sensitive. Ideally I'd like us to add a JSON equality checker that is agnostic about that stuff; we'll likely need one when we fix the EJSON corpus issues.
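A minimal sketch of what a spacing-agnostic comparison could look like, assuming it is acceptable to round-trip through `JSON.parse`. `normalizeJson` is a hypothetical helper, not an existing test utility.

```typescript
// Hypothetical helper: re-serialize through JSON so that insignificant
// whitespace no longer affects the comparison.
function normalizeJson(text: string): string {
  return JSON.stringify(JSON.parse(text));
}

const compact = '{"a":1,"b":{"$numberLong":"2"}}';
const spaced = '{ "a": 1, "b": { "$numberLong": "2" } }';

console.log(compact === spaced); // false
console.log(normalizeJson(compact) === normalizeJson(spaced)); // true
```

This sketch is still sensitive to key order and inherits any lossiness of `JSON.parse`, so it is only a starting point.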

Contributor

So stringified ejson is valid json right? The concern is circularly testing our bson library by using ejson.parse to test ejson.stringify and vice-versa. But could we use JSON.parse?

const input = { a: 1 };
Object.setPrototypeOf(input, { b: 2 });
const string = EJSON.stringify(input);
const parsed = JSON.parse(string);
expect(parsed).to.deep.equal({ a: 1 });

Contributor Author

Yea, for this test that's fine, I'll update. But generally, JSON.parse is lossy in certain scenarios that do matter for EJSON/BSON correctness, so it's not always viable.

Contributor

interesting, do you have an example?

Contributor Author

I kept the assertion that checks the b string doesn't exist, since that's what we really care about, so it seems worth keeping. And of course JSON.parse + deep equal confirms it.

Contributor Author

@nbbeeken nbbeeken Dec 1, 2022

Numbers that exceed the precision supported by javascript

JSON.parse(`{"a":9223372036854775807}`) // { a: 9223372036854776000 }
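To make the precision loss above concrete, here is a small sketch using only built-in JavaScript APIs. The variable names are illustrative.

```typescript
const text = '{"a":9223372036854775807}'; // max int64 written as a JSON number
const parsed: { a: number } = JSON.parse(text);

// The value is outside the range where doubles represent every integer.
console.log(Number.isSafeInteger(parsed.a)); // false
console.log(String(parsed.a)); // 9223372036854776000
// Round-tripping through JSON does not reproduce the original text.
console.log(JSON.stringify(parsed) === text); // false
```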

@nbbeeken nbbeeken force-pushed the NODE-4410-no-proto-keys branch from 06ed7c7 to 6af24ad Compare November 30, 2022 22:29
Comment on lines 6 to 11
const bytes = BSON.serialize({ someKey: [1] });
const options = { fieldsAsRaw: { someKey: true } };
Object.setPrototypeOf(options, { promoteValues: false });
const result = BSON.deserialize(bytes, options);
expect(result).to.have.property('someKey').that.is.an('array');
expect(result.someKey[0]).to.not.have.property('_bsontype', 'Int32');
Contributor

Related to the comment above: this test is kind of confusing. It's not immediately apparent why fieldsAsRaw is necessary when looking at the test. I had to read the code to realize that we only use the for-in loop when fieldsAsRaw is set to true, so we need that option to trigger the offending code.

context('when the fieldsAsRaw option is present and has a value that corresponds to a key in the object', () => {
  it('ignores non-own properties set on the options object', () => {
    const bytes = BSON.serialize({ someKey: [1] });
    const options = { fieldsAsRaw: { someKey: true } };
    Object.setPrototypeOf(options, { promoteValues: false });
    const result = BSON.deserialize(bytes, options);
    expect(result).to.have.property('someKey').that.is.an('array');
    expect(result.someKey[0], 'expected promoteValues option set on options object prototype to be ignored, but it was not').to.not.be.instanceOf(Int32);
  });
});


### BSON Element names are now fetched only from object's own properties

Previously, objects passed to the `BSON.serialize`, `BSON.calculateObjectSize`, and `EJSON.stringify` APIs had their element names enumerated with a `for-in` loop, which emits keys defined on the prototype. Since this is likely surprising, especially if a globally shared prototype has been modified, we now use `Object.keys` to enumerate the element names from a JS object.
Contributor

@baileympearson baileympearson Dec 1, 2022

Suggested change
`BSON.serialize`, `EJSON.stringify` and `BSON.calculateObjectSize` now only consider own properties and do not consider properties defined on the prototype of the object when serializing objects.
Example:
```typescript
const object = { a: 1 };
Object.setPrototypeOf(object, { b: 2 });
deserialize(serialize(object)); // { a: 1 } in 5.0, { a: 1, b: 2 } in 4.x
```

Contributor Author

updated with just a bit of rephrasing

Contributor

@baileympearson baileympearson left a comment

one small upgrade guide suggestion

baileympearson previously approved these changes Dec 5, 2022
@baileympearson
Contributor

@nbbeeken Merge conflicts :/

@baileympearson baileympearson added Team Review Needs review from team and removed Primary Review In Review with primary reviewer, not yet ready for team's eyes labels Dec 5, 2022
@baileympearson baileympearson merged commit 5103e4d into main Dec 5, 2022
@baileympearson baileympearson deleted the NODE-4410-no-proto-keys branch December 5, 2022 20:27