Skip to content

perf(NODE-4726): bench many strings in large batch in parallel #516

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 26, 2022

Conversation

nbbeeken
Copy link
Contributor

@nbbeeken nbbeeken commented Oct 18, 2022

Description

What is changing?

There's still some cruft in here, but its intentional in case someone else want to pull it down and check my work.
The issue reported does reproduce but only when we try to simulate parallelism. I don't think it is the parallelism that effects our BSON library it just offers enough load to be able to measure the difference.

The goal here isn't to fix anything but some hypothesis, some ruled out:

  • getValidatedString moved the buffer.toString to a helper, undoing this did not demonstrate a change
  • removing all the UTF8 key validation logic may have introduced overhead of allocating a Set, checking has etc.
    • removing it did not appear to have an impact
  • The analysis that I would recommend in a follow up is taking a close look at globals / new structures added to deserializeObject that don't exist in v1. Some known ones are:
    • DBRef shape checking
    • DataView for double parsing
    • UTF8 stuff ^

What is the motivation for this change?

This benchmark is meant to demonstrate the perf difference so we can follow up with fixes that can be proven to be effective.

Double check the following

  • Ran npm run lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: <type>(NODE-xxxx)<!>: <description>
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

@nbbeeken nbbeeken marked this pull request as ready for review October 19, 2022 18:36
@nbbeeken nbbeeken changed the title perf(NODE-4726): bench many strings in large batch in parallel [WIP] perf(NODE-4726): bench many strings in large batch in parallel Oct 19, 2022
@bajanam bajanam requested a review from dariakp October 24, 2022 13:27
@bajanam bajanam added the Primary Review In Review with primary reviewer, not yet ready for team's eyes label Oct 24, 2022
Copy link
Contributor

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR needs a rebase and a fresh re-install of dependencies.

Let's make sure that we capture the suggested approach in the description in a comment on the follow up ticket (either NODE-4283 or a new one that captures a more specific perf improvement)

Copy link
Contributor

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor thing here

@nbbeeken nbbeeken force-pushed the NODE-4726/bson-bench-strings-arrays branch from 46a6ff7 to d5ca5ae Compare October 25, 2022 21:45
@nbbeeken nbbeeken requested a review from dariakp October 25, 2022 21:51
@@ -68,7 +69,8 @@
"tsd": "^0.24.1",
"typescript": "^4.8.4",
"typescript-cached-transpile": "0.0.6",
"uuid": "^9.0.0"
"uuid": "^9.0.0",
"v8-profiler-next": "^1.9.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to have added a lot to the lock file I'm happy to make this a "must install before launching benchmaks" style dep, much like the other versions of the BSON lib are installed with --no-save

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no strong preferences here since it's a dev dependency; though if we're intending to run perf tests in the pipeline, may be worth keeping just to avoid surprise failures

@dariakp dariakp added Team Review Needs review from team and removed Primary Review In Review with primary reviewer, not yet ready for team's eyes labels Oct 25, 2022
Comment on lines +122 to +127
new Promise(resolve => {
setTimeout(() => {
resolve(bson.lib.deserialize(document, { validation: { utf8: false } }));
}, 20);
})
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of the setTimeout here? And why did you choose a setTimeout of 20ms specifically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It manufactures the delay that the driver does when we promise.all many many finds, 20ms is arbitrary, kept it low so the bench runs in a reasonable amount of time

@dariakp dariakp merged commit 62c7977 into main Oct 26, 2022
@dariakp dariakp deleted the NODE-4726/bson-bench-strings-arrays branch October 26, 2022 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team Review Needs review from team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants