CDRIVER-5925 apply batchSize:0 to aggregate in change stream #1909

Merged
kevinAlbs merged 6 commits into mongodb:master on Mar 13, 2025

Collaborator

@kevinAlbs kevinAlbs commented Mar 11, 2025

Summary

Apply batchSize:0 to aggregate for change streams.

Tested with this patch build.

Background & Motivation

WRITING-30021 reported an outage due to runaway aggregate operations overwhelming the server.

This PR supports setting batchSize:0 on the aggregate command sent to create a change stream cursor. This is intended to help the driver kill the server-side cursor when destroying the client handle.

Sending batchSize:0 can help cancel runaway aggregate commands on the server. batchSize:0 results in aggregate immediately returning with a cursor ID and no results. Knowing the cursor ID enables mongoc_change_stream_destroy to send the killCursors command to kill the server-side cursor.

Using batchSize:0 for aggregate is supported as a way to create a server-side cursor immediately. Quoting HELP-29972:

Both the aggregate command and find command support establishing a cursor with batchSize:0 exactly for this reason

WRITING-10467 proposes a long-term solution for all drivers: a new option emptyInitialBatch. The new option would also permit a non-default batchSize for getMore. A non-default batchSize for getMore is not needed to address the reported issue. This PR is intended to be a simpler short-term solution to address the urgent issue while avoiding a conflict with the spec.

Existing behavior with batchSize:0

The C driver currently ignores the batchSize:0 option when creating a change stream. As a result, both the aggregate and getMore commands use the server default batch size. Behavior for batchSize:0 differs between drivers.

In the C/C++ driver (full example):

mongocxx::options::change_stream options;
options.batch_size(0);
mongocxx::change_stream stream = collection.watch(options); // Does not send batchSize for aggregate or getMore.

In PyMongo (full example):

with collection.watch(batch_size=0) as stream: # Sends batchSize:0 for aggregate. Does not send batchSize for getMore.

This PR changes the C driver to behave like PyMongo.

Caveats

This is a subtle behavior change. The C driver currently ignores batchSize:0 for change streams. If a user were (unexpectedly?) setting batchSize:0, this change may add one round trip before initial results arrive, since the first aggregate would now return an empty batch. I expect the risk of negative impact to be low.

Does not impact public docs. Needed to distinguish "set to 0" from "unset".
Clarify this is the requested limit of documents to be returned by the server. Note `batchSize:0` as a special case.
@kevinAlbs kevinAlbs marked this pull request as ready for review March 11, 2025 15:44
@kevinAlbs kevinAlbs requested review from a user and mdb-ad March 11, 2025 15:44
@ghost ghost left a comment

Looks good to me!

@kevinAlbs kevinAlbs merged commit 27d2a68 into mongodb:master Mar 13, 2025
40 of 42 checks passed