CDRIVER-4363 add nsInfo builder #1584

Merged: 11 commits, Apr 26, 2024

Collaborator

@kevinAlbs kevinAlbs commented Apr 25, 2024

Summary

This PR adds a private component to build the nsInfo payload for CDRIVER-4363. No publicly visible changes are expected.

  • Add private type mcd_nsinfo_t to build the nsInfo payload in the bulkWrite command.
  • Add uthash 2.3.0.

Background & Motivation

The mcd_nsinfo_t is intended for upcoming support of the bulkWrite command. Each operation in a bulkWrite references an int32 index into an array of unique namespaces:

{
    "bulkWrite": 1,
    "ops": [
        {
            "insert": 0, # references 'db.coll1'
            "document": { "_id": "foo" }
        },
        {
            "insert": 0, # references 'db.coll1'
            "document": { "_id": "bar" }
        },
        {
            "insert": 1, # references 'db.coll2'
            "document": { "_id": "baz" }
        }
    ],
    "nsInfo": [
        { "ns": "db.coll1" },
        { "ns": "db.coll2" }
    ]
}

The syntax of the bulkWrite command is further described in Scope: Server improved bulk write command.

Each operation can refer to a different namespace. There can be up to maxWriteBatchSize operations in one command. The server reports maxWriteBatchSize in its response to the hello command; the value is currently 100,000.

My initial implementation stored a map of namespace to int32 in a bson_t and did a linear look-up for each operation. This resulted in an O(n^2) algorithm that took very long to execute for a large number of namespaces (I stopped waiting after several minutes).
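To illustrate why the linear approach scales poorly, here is a minimal sketch. It is not the original bson_t-based code: `ns_list_t`, `NS_LIST_CAP`, and `ns_list_find_or_append` are hypothetical names, and a fixed-capacity array stands in for the bson_t. Each call scans every stored namespace, so n operations over n distinct namespaces cost on the order of n^2 string comparisons.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-capacity list of unique namespaces (illustration only). */
#define NS_LIST_CAP 1024

typedef struct {
   const char *ns[NS_LIST_CAP]; /* borrowed pointers; caller keeps strings alive */
   int32_t count;
} ns_list_t;

/* Return the index of `ns`, appending it if absent.
 * Every call walks the whole list: O(n) per call, O(n^2) for n operations. */
static int32_t
ns_list_find_or_append (ns_list_t *list, const char *ns)
{
   for (int32_t i = 0; i < list->count; i++) {
      if (strcmp (list->ns[i], ns) == 0) {
         return i;
      }
   }
   assert (list->count < NS_LIST_CAP); /* sketch does not grow the array */
   list->ns[list->count] = ns;
   return list->count++;
}
```

A zero-initialized `ns_list_t` is an empty list; the returned index is the value an operation would place in its `insert` field.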

A hash table with faster look-up was needed. uthash was chosen for simplicity (it is a small header-only library) and for consistency with other dependencies already in use (utlist.h is already copied).
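The idea of hashed namespace look-up can be sketched as follows. This is a self-contained stand-in, not the uthash-based mcd_nsinfo_t from this PR: `ns_table_t`, `NS_TABLE_SLOTS`, `ns_hash`, and `ns_table_find_or_append` are hypothetical names, and the fixed-size open-addressing table with an FNV-1a hash is an assumption chosen to keep the example free of external headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hash table; the PR itself uses uthash. */
#define NS_TABLE_SLOTS 2048 /* power of two; kept at most half full */

typedef struct {
   const char *keys[NS_TABLE_SLOTS]; /* NULL marks an empty slot */
   int32_t index[NS_TABLE_SLOTS];    /* nsInfo index assigned to each key */
   int32_t count;                    /* number of unique namespaces stored */
} ns_table_t;

/* 32-bit FNV-1a string hash. */
static uint32_t
ns_hash (const char *s)
{
   uint32_t h = 2166136261u;
   for (; *s; s++) {
      h ^= (unsigned char) *s;
      h *= 16777619u;
   }
   return h;
}

/* Return the nsInfo index of `ns`, assigning the next index if it is new.
 * Expected cost per call is O(1) instead of a scan over all namespaces. */
static int32_t
ns_table_find_or_append (ns_table_t *t, const char *ns)
{
   uint32_t slot = ns_hash (ns) & (NS_TABLE_SLOTS - 1);
   while (t->keys[slot] != NULL) {
      if (strcmp (t->keys[slot], ns) == 0) {
         return t->index[slot];
      }
      slot = (slot + 1) & (NS_TABLE_SLOTS - 1); /* linear probing */
   }
   assert (t->count < NS_TABLE_SLOTS / 2); /* sketch does not resize */
   t->keys[slot] = ns; /* borrowed pointer; caller keeps the string alive */
   t->index[slot] = t->count;
   return t->count++;
}
```

A zero-initialized `ns_table_t` is an empty table. A real implementation would also need to grow the table and own copies of the namespace strings, which a library such as uthash helps with.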

This is an unmodified copy
Intended to be used to construct the `nsInfo` payload for the upcoming `bulkWrite` command
@kevinAlbs kevinAlbs marked this pull request as ready for review April 26, 2024 11:51
This enables keeping an unmodified copy of the `uthash.h` file.
This puts `uthash-2.3.0` in the include path, as in `#include <uthash-2.3.0/uthash.h>`, which may clarify the expected header location for readers.
Collaborator Author

@kevinAlbs kevinAlbs left a comment

Thank you for the suggestions.

utlist.h has also been updated to 2.3.0.

Patch build to verify: https://spruce.mongodb.com/version/662bc956c099380007552450

@kevinAlbs kevinAlbs requested a review from eramongodb April 26, 2024 15:50
@kevinAlbs kevinAlbs merged commit 58aa8c3 into mongodb:master Apr 26, 2024