-
Notifications
You must be signed in to change notification settings - Fork 266
PHPLIB-418: Add the ability to specify a pipeline to an update command #670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1,4 +1,9 @@ | |||
{ | |||
"runOn": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/Operation/ReplaceOne.php
Outdated
@@ -76,6 +76,10 @@ public function __construct($databaseName, $collectionName, $filter, $replacemen | |||
throw new InvalidArgumentException('First key in $replacement argument is an update operator'); | |||
} | |||
|
|||
if (\MongoDB\is_pipeline($replacement)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense, although I didn't find this discussed in the CRUD spec. In hindsight, I think mongodb/specifications@7246066#diff-5efd2e4ce61f4ccda0ce031099303996 for SPEC-1267 should have addressed this in the CRUD spec's Update vs. Replace Validation section. It could also have added a failing test for replaceOne
and replace via BulkWrite
. Would you mind opening a spec ticket? Hopefully we can get to it during SPEC triage Monday afternoon (NYC time).
Although documents cannot contain keys prefixed with $
, I was wondering if the following replacement document might be ambiguous (at least for our is_pipeline()
function):
[
[ '$ref' => '<any string>' ],
[ '$id' => '<any non-array value>' ],
]
This could be interpreted as a replacement document assigning a DBRef to the document's top-level "0" field, or a pipeline with $ref
and $id
stages. I decided to test this in the mongo
shell:
rs0:PRIMARY> db.foo.drop()
false
rs0:PRIMARY> db.foo.insert({"0":1})
WriteResult({ "nInserted" : 1 })
rs0:PRIMARY> db.foo.findOne()
{ "_id" : ObjectId("5d61d496637adc23cd5c4215"), "0" : 1 }
rs0:PRIMARY> db.foo.replaceOne({},[{"$ref":"foo","$id":"bar"}])
2019-08-24T20:26:51.491-0400 E QUERY [js] uncaught exception: Error: Cannot use pipeline-style updates in a replacement operation :
DBCollection.prototype.replaceOne@src/mongo/shell/crud_api.js:461:15
@(shell):1:1
That supports the validation you have here. Still, it's a client-side error and I was wondering what the server-side error might be if we actually passed this along to the update
command. Does MongoDB 4.2 accept a BSON array as a valid replacement document in the update
command's updates[].u
option? Recent server versions added stricter validation, but I believe in some older servers (perhaps in the 3.x range) a BSON array/document were accepted interchangeably. Let's find out:
rs0:PRIMARY> db.runCommand({update:"foo",updates:[{q:{},u:[{"$ref":"foo","$id":"bar"}]}]});
{
"n" : 0,
"nModified" : 0,
"writeErrors" : [
{
"index" : 0,
"code" : 40323,
"errmsg" : "A pipeline stage specification object must contain exactly one field."
}
],
...
}
My DBRef example is a total edge case, but let's suppose an application actually did want to perform that replacement for some reason. Passing an actual BSON document (i.e. object cast on the PHP array) does the trick:
rs0:PRIMARY> db.runCommand({update:"foo",updates:[{q:{},u:{"0":{"$ref":"foo","$id":"bar"}}}]});
{
"n" : 1,
"nModified" : 1,
...
}
I also traced the code path for constructing the update
command and we can see in _mongoc_write_command_update_append()
that libmongoc always appends the u
option as a document unless it's a pipeline. So even if a PHP array was used as a replacement document and managed to sneak past our update/replace validation, we could trust libmongoc to actually send a BSON document to the server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for that detailed analysis. I've created a spec ticket to ask for clarification in the spec. From the result of your tests, I deduce that our handling is correct in this case with users having to explicitly pass an object for a replacement that you suggested above.
I agree that it's an edge case, especially considering that DBRef objects normally contain both $ref
and $id
keys. With that in mind, we could add logic to MongoDB\is_pipeline
to assert that each stage in the potential pipeline only contains a single element; this would at least remove the client-side assertion when replacing a document with a single DBRef object, as we would no longer accept that as a valid aggregation pipeline. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we talked this through early Monday, but the current logic in is_pipeline()
seems fine. If the user actually needs to specify { "0": DBRef(...) }
as a replacement document they can cast it to an object.
Checking that each element within the pipeline array is a document with only a single key seems like overkill for what could be a common code path for update-heavy code.
return false; | ||
} | ||
|
||
if ($pipeline === []) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing from my previous shell testing, it looks like MongoDB 4.2 accepts an empty pipeline and simply NOPs:
rs0:PRIMARY> db.runCommand({update:"foo",updates:[{q:{},u:[]}]});
{
"n" : 1,
"nModified" : 0,
...
}
Contrast this with an empty replacement document:
rs0:PRIMARY> db.runCommand({update:"foo",updates:[{q:{},u:{}}]});
{
"n" : 1,
"nModified" : 1,
...
}
Is this function based on libmongoc's _mongoc_document_is_pipeline()
? I see that also returns false
if the argument has no elements.
I'm not arguing that this is incorrect. I actually think it's the most intuitive behavior, especially for PHP where users would most commonly pass []
as an empty replacement document; however, I'm curious if this was dictated by the spec. If not, perhaps we should mention it in the SPEC ticket I proposed opening in my earlier comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this function based on libmongoc's
_mongoc_document_is_pipeline()
? I see that also returnsfalse
if the argument has no elements.
That is correct. I assumed that it'd be wrong to treat an empty array as an aggregation pipeline, but I also specifically remember one test failing because of that, which is why I added the check after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a note of this in SPEC-1430.
2fa9984
to
4991997
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A suggestion for extra tests, but LGTM.
https://jira.mongodb.org/browse/PHPLIB-418