PHPLIB-418: Add the ability to specify a pipeline to an update command #670

alcaeus · 2019-08-22T12:16:54Z

https://jira.mongodb.org/browse/PHPLIB-418

alcaeus · 2019-08-22T13:13:38Z

tests/SpecTests/crud/updateWithPipelines.json

@@ -1,4 +1,9 @@
 {
+  "runOn": [


See mongodb/specifications#635.

docs/includes/apiargs-MongoDBCollection-common-param.yaml

src/Operation/FindAndModify.php

jmikola · 2019-08-25T00:33:01Z

src/Operation/ReplaceOne.php

@@ -76,6 +76,10 @@ public function __construct($databaseName, $collectionName, $filter, $replacemen
            throw new InvalidArgumentException('First key in $replacement argument is an update operator');
        }

+        if (\MongoDB\is_pipeline($replacement)) {


I think this makes sense, although I didn't find this discussed in the CRUD spec. In hindsight, I think mongodb/specifications@7246066#diff-5efd2e4ce61f4ccda0ce031099303996 for SPEC-1267 should have addressed this in the CRUD spec's Update vs. Replace Validation section. It could also have added a failing test for replaceOne and replace via BulkWrite. Would you mind opening a spec ticket? Hopefully we can get to it during SPEC triage Monday afternoon (NYC time).

Although documents cannot contain keys prefixed with $, I was wondering if the following replacement document might be ambiguous (at least for our is_pipeline() function):

[ [ '$ref' => '<any string>' ], [ '$id' => '<any non-array value>' ], ]

This could be interpreted as a replacement document assigning a DBRef to the document's top-level "0" field, or a pipeline with $ref and $id stages. I decided to test this in the mongo shell:

rs0:PRIMARY> db.foo.drop() false rs0:PRIMARY> db.foo.insert({"0":1}) WriteResult({ "nInserted" : 1 }) rs0:PRIMARY> db.foo.findOne() { "_id" : ObjectId("5d61d496637adc23cd5c4215"), "0" : 1 } rs0:PRIMARY> db.foo.replaceOne({},[{"$ref":"foo","$id":"bar"}]) 2019-08-24T20:26:51.491-0400 E QUERY [js] uncaught exception: Error: Cannot use pipeline-style updates in a replacement operation : DBCollection.prototype.replaceOne@src/mongo/shell/crud_api.js:461:15 @(shell):1:1

That supports the validation you have here. Still, it's a client-side error and I was wondering what the server-side error might be if we actually passed this along to the update command. Does MongoDB 4.2 accept a BSON array as a valid replacement document in the update command's updates[].u option? Recent server versions added stricter validation, but I believe in some older servers (perhaps in the 3.x range) a BSON array/document were accepted interchangeably. Let's find out:

rs0:PRIMARY> db.runCommand({update:"foo",updates:[{q:{},u:[{"$ref":"foo","$id":"bar"}]}]}); { "n" : 0, "nModified" : 0, "writeErrors" : [ { "index" : 0, "code" : 40323, "errmsg" : "A pipeline stage specification object must contain exactly one field." } ], ... }

My DBRef example is a total edge case, but let's suppose an application actually did want to perform that replacement for some reason. Passing an actual BSON document (i.e. object cast on the PHP array) does the trick:

rs0:PRIMARY> db.runCommand({update:"foo",updates:[{q:{},u:{"0":{"$ref":"foo","$id":"bar"}}}]}); { "n" : 1, "nModified" : 1, ... }

I also traced the code path for constructing the update command and we can see in _mongoc_write_command_update_append() that libmongoc always appends the u option as a document unless it's a pipeline. So even if a PHP array was used as a replacement document and managed to sneak past our update/replace validation, we could trust libmongoc to actually send a BSON document to the server.

Thank you for that detailed analysis. I've created a spec ticket to ask for clarification in the spec. From the result of your tests, I deduce that our handling is correct in this case with users having to explicitly pass an object for a replacement that you suggested above.

I agree that it's an edge case, especially considering that DBRef objects normally contain both $ref and $id keys. With that in mind, we could add logic to MongoDB\is_pipeline to assert that each stage in the potential pipeline only contains a single element; this would at least remove the client-side assertion when replacing a document with a single DBRef object, as we would no longer accept that as a valid aggregation pipeline. What do you think?

I think we talked this through early Monday, but the current logic in is_pipeline() seems fine. If the user actually needs to specify { "0": DBRef(...) } as a replacement document they can cast it to an object.

Checking that each element within the pipeline array is a document with only a single key seems like overkill for what could be a common code path for update-heavy code.

jmikola · 2019-08-25T00:45:55Z

src/functions.php

+        return false;
+    }
+
+    if ($pipeline === []) {


Continuing from my previous shell testing, it looks like MongoDB 4.2 accepts an empty pipeline and simply NOPs:

rs0:PRIMARY> db.runCommand({update:"foo",updates:[{q:{},u:[]}]}); { "n" : 1, "nModified" : 0, ... }

Contrast this with an empty replacement document:

rs0:PRIMARY> db.runCommand({update:"foo",updates:[{q:{},u:{}}]}); { "n" : 1, "nModified" : 1, ... }

Is this function based on libmongoc's _mongoc_document_is_pipeline()? I see that also returns false if the argument has no elements.

I'm not arguing that this is incorrect. I actually think it's the most intuitive behavior, especially for PHP where users would most commonly pass [] as an empty replacement document; however, I'm curious if this was dictated by the spec. If not, perhaps we should mention it in the SPEC ticket I proposed opening in my earlier comment.

Is this function based on libmongoc's _mongoc_document_is_pipeline()? I see that also returns false if the argument has no elements.

That is correct. I assumed that it'd be wrong to treat an empty array as an aggregation pipeline, but I also specifically remember one test failing because of that, which is why I added the check after all.

I made a note of this in SPEC-1430.

src/functions.php

jmikola

A suggestion for extra tests, but LGTM.

docs/includes/apiargs-MongoDBCollection-common-param.yaml

tests/FunctionsTest.php

alcaeus requested a review from jmikola August 22, 2019 12:16

alcaeus self-assigned this Aug 22, 2019

alcaeus force-pushed the phplib-418 branch from 07eb60b to 4cfdcd9 Compare August 22, 2019 12:41

alcaeus commented Aug 22, 2019

View reviewed changes

tests/SpecTests/crud/updateWithPipelines.json

@@ -1,4 +1,9 @@

{

"runOn": [

Copy link

Member Author

alcaeus Aug 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See mongodb/specifications#635.

alcaeus force-pushed the phplib-418 branch from a665a23 to d4a6e6e Compare August 23, 2019 07:35

jmikola reviewed Aug 25, 2019

View reviewed changes

alcaeus force-pushed the phplib-418 branch 3 times, most recently from 2fa9984 to 4991997 Compare August 26, 2019 11:14

jmikola approved these changes Aug 27, 2019

View reviewed changes

docs/includes/apiargs-MongoDBCollection-common-param.yaml Outdated Show resolved Hide resolved

tests/FunctionsTest.php Outdated Show resolved Hide resolved

alcaeus added 2 commits August 27, 2019 08:56

PHPLIB-418: Add the ability to specify a pipeline to an update command

2628c09

PHPLIB-418: Fix wrong runOn syntax in crud/updateWithPiplines spec tests

fc87768

alcaeus force-pushed the phplib-418 branch from 4991997 to fc87768 Compare August 27, 2019 06:56

alcaeus added a commit that referenced this pull request Aug 27, 2019

Merge pull request #670

fcd0d78

alcaeus merged commit fc87768 into mongodb:master Aug 27, 2019

alcaeus deleted the phplib-418 branch August 27, 2019 12:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

PHPLIB-418: Add the ability to specify a pipeline to an update command #670

PHPLIB-418: Add the ability to specify a pipeline to an update command #670

Uh oh!

alcaeus commented Aug 22, 2019 •

edited

Loading

Uh oh!

alcaeus Aug 22, 2019

Uh oh!

Uh oh!

Uh oh!

jmikola Aug 25, 2019

Uh oh!

alcaeus Aug 26, 2019

Uh oh!

jmikola Aug 27, 2019

Uh oh!

jmikola Aug 25, 2019

Uh oh!

alcaeus Aug 26, 2019

Uh oh!

jmikola Aug 27, 2019

Uh oh!

Uh oh!

jmikola left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

PHPLIB-418: Add the ability to specify a pipeline to an update command #670

PHPLIB-418: Add the ability to specify a pipeline to an update command #670

Uh oh!

Conversation

alcaeus commented Aug 22, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alcaeus Aug 22, 2019

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jmikola Aug 25, 2019

Choose a reason for hiding this comment

Uh oh!

alcaeus Aug 26, 2019

Choose a reason for hiding this comment

Uh oh!

jmikola Aug 27, 2019

Choose a reason for hiding this comment

Uh oh!

jmikola Aug 25, 2019

Choose a reason for hiding this comment

Uh oh!

alcaeus Aug 26, 2019

Choose a reason for hiding this comment

Uh oh!

jmikola Aug 27, 2019

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jmikola left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

alcaeus commented Aug 22, 2019 •

edited

Loading