Skip to content

SWIFT-137 Sketch remainder of CRUD API #83

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 10, 2018
Merged

SWIFT-137 Sketch remainder of CRUD API #83

merged 1 commit into from
Jul 10, 2018

Conversation

kmahar
Copy link
Contributor

@kmahar kmahar commented Jul 2, 2018

This adds unimplemented methods for bulkWrite, find and modify ops, and the new count API, plus associated structs.

@kmahar kmahar requested review from jmikola and mbroadst July 2, 2018 18:25
* - Returns: The count of the documents that matched the filter
*/
public func countDocuments(_ filter: Document = [:], options: CountOptions? = nil) throws -> Int {
throw MongoError.commandError(message: "Unimplemented command")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a todo linked to a JIRA ticket? Likewise for estimatedDocumentCount below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

*
* - Returns: The count of the documents that matched the filter
*/
public func countDocuments(_ filter: Document = [:], options: CountOptions? = nil) throws -> Int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CountDocumentsOptions instead of CountOptions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to duplicate the struct since it's still used by our (to-be-deprecated) count, and the CRUD spec still calls it CountOptions. are we allowed to deviate from that?

if so I guess we could get around the duplication issue for now by creating a typealias for now, so that CountOptions is also named CountDocumentOptions?

Copy link
Member

@jmikola jmikola Jul 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typealias seems best so we can be consistent about the option structs matching the method names. The CRUD spec edits for new count APIs were done hastily, so I think that explains the oversight.

A more general question may be whether it makes sense to use option structs for these things instead of just adding optional arguments to the methods themselves. It may be worth taking a look at the Python and Ruby drivers for inspiration, as both of them use keyword args AFAIK. PHP doesn't have that, so we take all options in a dictionary arg (comparable to these option structs I suppose).

From the user's perspective, I suppose the question is whether it's more pleasant to create an options struct that gets immediately passed into a method or just specify those options on the method itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I have created a typealias.

from a driver internals perspective, I strongly prefer the struct approach. having an Encodable struct makes it super simple to serialize our options to bson_ts with just one line - try BsonEncoder().encode(options). if they were taken in as arguments, we'd have to either craft/encode the options doc by hand in the method (annoying) or put all the provided args into basically a private options struct in order to encode them correctly - which would lead to us double-defining each option, once as a parameter and then again in the struct.

from a user perspective, it is probably mildly more annoying to create an options struct, particularly in cases where the name is super long:

let count = try collection.estimatedDocumentCount(options: EstimatedDocumentCountOptions(maxTimeMS: 5))

very cumbersome compared to if we just had

let count = try collection.estimatedDocumentCount(maxTimeMS: 5)

but the struct approach does give them the benefit of being able to reuse a set of options more easily, and also makes all the method signatures far more readable IMO.

I would be open to adding some helper methods that take in the params directly in cases where there are only a few, but I'm not sure where we would draw the line on that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be open to adding some helper methods that take in the params directly in cases where there are only a few, but I'm not sure where we would draw the line on that.

These are good points. I'm cool with sticking to option structs and not deviating so we can avoid having to decide where a line is. Let's just keep everything consistent.

Folks can always create their options separately to avoid the first example with a super long line of code.

/// Options to use when executing a `findOneAndUpdate` command on a `MongoCollection`.
public struct FindOneAndUpdateOptions {
/// A set of filters specifying to which array elements an update should apply.
public let arrayFilters: [Document]?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, FindOneAndUpdateOptions is basically FindOneAndReplaceOptions with the addition of this option. Since Swift doesn't allow struct inheritance, are we stuck with copy-pasting everything?

The same thought also occurred while looking at EstimatedDocumentCountOptions, which only contains a common maxTimeMS member.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the lack of inheritance for structs is why we have to do this copy-pasting :/

/// The maximum amount of time to allow the query to run.
public let maxTimeMS: Int64?

/// Limits the fields to return for all matching documents.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit, but "documents" can refer to "document" for all descriptions in the findAndModify option structs since these operations never work with multiple documents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. your comment also led me to realize my sort comments didn't make sense (I think I copy pasted that from somewhere else that supports sort...) so I've updated those as well.

*
* - Returns: The deleted document, or `nil` if no document was deleted.
*/
@discardableResult
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that discarding the result doesn't make sense, as users should probably use the single-statement methods (e.g. deleteOne); however, the fact that findAndModify supports a sort option justifies why someone may want to use these methods while still not caring about the return value. 👍

* Sends a batch of writes to the server at the same time.
*
* - Parameters:
* - requests: an `[WriteModel]` containing the writes to perform
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "a" is fine here, unless you were actually writing "an array of WriteModels".

* - requests: an `[WriteModel]` containing the writes to perform
* - options: optional `BulkWriteOptions` to use while executing the operation
* - Returns: a `BulkWriteResult`
* - Throws: an error if `requests` is empty, or if any error occurs while performing the writes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the error type ``MongoError.typeErrorbe noted here? I noticed that was done in forDocument.get` and I did the same in ReadPreference docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added error types. per our discussion earlier, these will most likely change as part of our error refactoring epic, but for now they are consistent w/ the rest of the driver.


/// Options to use when performing a bulk write operation on a `MongoCollection`.
public struct BulkWriteOptions {
/// If `true`, when an insert fails, return without performing the remaining writes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this just use /** ... */?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have condensed the third line with the first, five lines for the comment seemed excessive

}

/// The result of a bulk write operation on a `MongoCollection`.
public struct BulkWriteResult: Decodable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to provide accessors for write errors, write concern errors, and whether the write was acknowledged? PHP does that with WriteResult.

If a bulk write execution results in an error, libmongoc will create an error message for us but users should still have a programmatic API to inspect the errors. This is particularly relevant if there are both write and write concern errors.

On a separate note, users may want to know whether a write was acknowledged as some fields should not be accessible otherwise (or they should be nil).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given our conversation earlier, not totally sure how we want to handle this....
would it make sense to throw from MongoCollection.bulkWrite if there is a write error or write concern error? that would mean if we actually get to the point of returning a BulkWriteResult, there are no errors.

Copy link
Member

@jmikola jmikola Jul 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MongoCollection.bulkWrite if there is a write error or write concern error?

PHP determines whether to throw an exception based on the return value of mongoc_bulk_operation_execute(). That function does return 0 in the event of a write or write concern error.

I think it's easier to convert the entire libmongoc reply document into a write result struct and stuff that onto an error rather than split the document into separate entities (document for count/ID fields, writeErrors array, and writeConcernError document).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I think for an unacknowledged write we should just return nil. we are already doing (or supposed to be doing) that in some other places in the driver, and since we have optionals I think it's more idiomatic. so this method should actually return a BulkWriteResult?.

re: errors, IMO it's kind of confusing to users if there's an error field when there doesn't need to be - they might think they need to manually check each BulkWriteResult they get back to see if writeErrors or writeConcernErrors have anything in them.

instead I propose we add these cases to MongoError:

case writeError(code: Int32, message: String)
case bulkWriteError(result: BulkWriteResult, writeErrors: [Document], writeConcernError: Document?)

so then when mongoc_bulk_operation_execute fails, we will throw a bulkWriteError that stores all the info. when it succeeds (and the write is acknowledged), we'll just return a BulkWriteResult.

it does require a bit of work to split up the info, but I think our BSON decoder will automate some of that, and it saves users having to go down two nested levels (error -> result struct -> error fields) to actually see the errors.

thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I think for an unacknowledged write we should just return nil. we are already doing (or supposed to be doing) that in some other places in the driver, and since we have optionals I think it's more idiomatic.

If the Swift driver has no intention of reporting inserted IDs, I suppose BulkWriteResult? makes sensein place of a BulkWriteResult consisting entirely of unset optionals and a single isAcknowledged = false property.

I personally think it's useful for drivers to report inserted IDs, since those are often generated by the driver when a document without an _id field is inserted; however, a reasonable counter-argument is that any users that actually care about the inserted ID could simply set _id themselves before the insert operation.

IMO it's kind of confusing to users if there's an error field when there doesn't need to be - they might think they need to manually check each BulkWriteResult they get back to see if writeErrors or writeConcernErrors have anything in them.

I think that bulkWriteError should still have a message and/or code. libmongoc is already going to provide a bson_error_t for us and we can use that as we normally do. The only extra behavior is that we also want to stuff the bson_t reply into the error as well.

One thought is that BulkWriteResult may need to be optional. Consider the scenario where an unacknowledged write encounters a network error. libmongoc is still going to fill out a bson_error_t even if there isn't anything to report in the bson_t reply.

it does require a bit of work to split up the info, but I think our BSON decoder will automate some of that, and it saves users having to go down two nested levels (error -> result struct -> error fields) to actually see the errors.

I'm personally in favor of BulkWriteResult matching the bson_t reply provided by libmongoc, but if that complicates things I won't object. I imagine not splitting out the writeErrors and writeConcernError fields might prevent us from being able to re-use the same BulkWriteResult struct for both the successful bulkWrite() return value and the bulkWriteError type. In that case, let's aim for re-use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ just updated my comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should writeErrors: [MongoError] be [MongoError]?, or would you ensure that writeErrors is an empty array even if the only bulk write error was the writeConcernError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining we'd just use an empty array, but thinking about it more, I think making it an optional is better. it's nicer to be able to say like if let writeErrs = error.writeErrors { ... } instead of if errors.writeErrors.count > 0 { ... }.

also saves us creating a bunch of unused arrays

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's discuss the bulkWrite / unacknowledged stuff in triage as this is blocking further progress on our CRUD work

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We concluded to keep the existing insertOne/insertMany behavior and have operations return an optional result object (i.e. nil for {w:0}).

public let insertedIds: [Int: AnyBsonValue]

/// Number of documents matched for update.
public let matchedCount: Int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the above comment, the update and deleted counts originate from the server and won't be available if the write concern was unacknowledged. I think everything apart from insertedIds (which can be generated entirely client-side) should be optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I just remembered that you do have insertedIds here. If that's going to stay, then I think you should probably provide it for unacknowledged writes as well.

import libmongoc

/// An extension of `MongoCollection` encapsulating find and modify operations.
extension MongoCollection {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that I have modified all the find and modify commands to support using the MongoCollection's CollectionType.

@@ -128,7 +128,7 @@ public class WriteConcern: Codable {
let journalStr = String(describing: journal)
let wStr = String(describing: w)
let timeoutStr = String(describing: wtimeoutMS)
throw MongoError.writeConcernError(message:
throw MongoError.writeConcernError(code: 0, message:
Copy link
Member

@jmikola jmikola Jul 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is something we can clear up with SWIFT-145? Ideally, we should use something like an invalidArgumentError here and save writeConcernError for an actual runtime exception (i.e. from the server).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, since I'm here anyway I'll just change this to a MongoError.invalidArgument

@kmahar kmahar merged commit 17728bd into master Jul 10, 2018
@kmahar kmahar deleted the SWIFT-137 branch July 10, 2018 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants